SlideShare is now on Android. 15 million presentations at your fingertips.  Get the app

×
  • Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
 

GHTorrent: Github's Data from a Firehose

by Working at TU Delft on Jun 03, 2012

  • 1,721 views

A common requirement of many empirical software engineering studies is the acquisition and curation of data from software repositories. During the last few years, GitHub has emerged as a popular ...

A common requirement of many empirical software engineering studies is the acquisition and curation of data from software repositories. During the last few years, GitHub has emerged as a popular project hosting, mirroring and collaboration platform. GitHub provides an extensive REST API, which enables researchers to retrieve both the commits to the projects’ repositories and events generated through user actions on project resources. GHTorrent aims to create a scalable off line mirror of GitHub’s event streams and persistent data, and offer it to the research community as a service. In this paper, we present the project’s design and initial implementation and demonstrate how the provided datasets can be queried and processed.

Statistics

Views

Total Views
1,721
Views on SlideShare
1,590
Embed Views
131

Actions

Likes
2
Downloads
13
Comments
0

9 Embeds 131

http://www.gousios.gr 54
http://localhost 26
https://twitter.com 19
http://rubydoc.info 16
http://coderwall.com 6
http://www.verious.com 6
http://gousios.gr 2
http://kred.com 1
http://twitter.com 1
More...

Accessibility

Categories

Upload Details

Uploaded via SlideShare as Apple Keynote

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
Post Comment
Edit your comment

GHTorrent: Github's Data from a Firehose GHTorrent: Github's Data from a Firehose Presentation Transcript