Open-source version control systems and repositories are growing rapidly, and a large number of new software projects are implemented, developed and maintained through them. These systems allow software engineers to collaborate directly with each other, organize their work effectively and maintain an up-to-date history of a project’s evolution. As a result, the volume of stored information is significant, and harnessing it can lead to the development of smart and efficient systems. In this diploma thesis, a machine learning system is developed that stores, processes and groups source code changes made during development, with the goal of extracting source code change patterns. These patterns can act as recommendations for new projects, in order to optimize code development and/or fix potential bugs found repeatedly in project repositories. The proposed methodology was applied to the GitHub code hosting platform, which tracks changes to the source code files contained in a repository. These changes are represented as Abstract Syntax Trees (ASTs), so that a similarity metric for their algorithmic structure can be calculated. In addition, their semantic similarity is calculated, so that the source code changes can finally be clustered. Clusters that meet specific criteria contain patterns of source code changes that can be used to provide recommendations for new software projects.
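
As a rough illustration of the idea summarized above (comparing code changes by the structure of their ASTs and then grouping similar ones), the following Python sketch may help. It is not the implementation developed in the thesis: it assumes Python snippets parsed with the standard `ast` module, approximates structural similarity by matching sequences of node types with `difflib.SequenceMatcher` rather than a dedicated tree similarity metric, leaves out the semantic similarity step entirely, and uses a simple greedy grouping. The names `node_types`, `structural_similarity` and `cluster`, the threshold of 0.8 and the sample snippets are all hypothetical.

```python
import ast
from difflib import SequenceMatcher


def node_types(source: str) -> list[str]:
    """Return the AST node type names of `source` in breadth-first order."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def structural_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between the node-type sequences of two snippets.

    Assumption: sequence matching stands in for a real tree similarity metric.
    """
    return SequenceMatcher(None, node_types(a), node_types(b)).ratio()


def cluster(snippets: list[str], threshold: float = 0.8) -> list[list[int]]:
    """Greedy grouping: a snippet joins the first cluster containing a member
    that is at least `threshold` similar to it, else it starts a new cluster."""
    clusters: list[list[int]] = []
    for i, snippet in enumerate(snippets):
        for members in clusters:
            if any(structural_similarity(snippet, snippets[j]) >= threshold
                   for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical code changes: the first two share the same structure
# (a None check followed by an assignment), the third is a loop.
changes = [
    "if x is None:\n    x = 0",
    "if y is None:\n    y = 1",
    "for item in items:\n    print(item)",
]
groups = cluster(changes)
print(groups)  # [[0, 1], [2]]
```

In this sketch the first two changes fall into the same cluster because identifiers and literal values are ignored and only node types are compared, which is one simple way to capture "algorithmic structure"; a cluster such as `[0, 1]` would then be a candidate pattern.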