
Python developers are the most giving

GitHub, as you probably know, has quickly become the main (or, at least, the best-known) repository for open source software. It currently hosts millions of code repositories, written in hundreds of programming languages and supported by several million users. The sheer volume of activity on GitHub reflects the growing popularity of open source software and the commitment of many people to working together to improve code across many different programming languages.

But I wondered recently whether some programming languages tend to generate more open source contributions than others. Thanks to Google BigQuery, I was able to poke around the raw GitHub Archive data myself to look into this further. To quantify it, I looked at the average number of pull requests opened per GitHub repository, by programming language. I thought that would be a good (but certainly not perfect) proxy for the number of contributions (or attempted contributions) made to a code base by someone other than the repository owner.

First, let me present the results.

Bar chart showing average number of pull requests per GitHub repository by programming languages. The results: Python .94, PHP .83, CoffeeScript .78, JavaScript .74, C++ .74, Ruby .68, C .65, Perl .64, Go .64, Java .58, C# .46, Shell .45, Objective-C .41, CSS .29, VimL .20

Python generates the most pull requests, on average, per GitHub repository

Image credit: ITworld/Phil Johnson

Now, here's my methodology

  • As mentioned, I queried the GitHub Archive using Google BigQuery. At the time, the archive covered (roughly) GitHub activity from March 11, 2011 through March 14, 2014.

  • First, I queried the number of (non-forked) repositories per programming language using the following:

    SELECT repository_language, COUNT(DISTINCT repository_url) AS cnt
    FROM [githubarchive:github.timeline]
    WHERE repository_fork == "false"
    GROUP BY repository_language;

    This gave me results for 150 programming languages covering over 4.1 million repositories (I ignored repositories with no programming language specified), for an average of 27,473 repositories per language.

  • I then queried the number of pull requests opened per programming language using the following:

    SELECT repository_language, COUNT(*) AS cnt
    FROM [githubarchive:github.timeline]
    WHERE repository_fork == "false"
      AND type = "PullRequestEvent"
      AND payload_action = "opened"
    GROUP BY repository_language;

    Again ignoring repositories without a programming language specified, this gave me a total of just under 2.8 million pull requests across the 150 programming languages, for an average of 20,567 pull requests per language.

  • Overall, for the 4.1 million repositories with a programming language specified, the average number of pull requests opened per repository was .67.
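The per-language figures come from dividing the second query's counts by the first's. Here is a minimal Python sketch of that step, assuming the two result sets have been loaded into dictionaries keyed by language. The function names and the sample counts are hypothetical, invented for illustration; they are not the figures returned by the actual queries.

```python
def pulls_per_repo(repo_counts, pull_counts):
    """Average pull requests opened per repository, by language.

    repo_counts: {language: number of non-forked repositories}
    pull_counts: {language: number of pull requests opened}
    Rows with no language specified (None or "") are ignored.
    """
    averages = {}
    for language, repos in repo_counts.items():
        if not language or repos == 0:
            continue  # skip unlabeled rows and avoid dividing by zero
        averages[language] = pull_counts.get(language, 0) / repos
    return averages


def overall_average(repo_counts, pull_counts):
    """Total pull requests divided by total repositories, labeled languages only."""
    languages = [lang for lang, repos in repo_counts.items() if lang and repos > 0]
    total_repos = sum(repo_counts[lang] for lang in languages)
    total_pulls = sum(pull_counts.get(lang, 0) for lang in languages)
    return total_pulls / total_repos


# Made-up counts, for illustration only (not the real query output).
sample_repos = {"LangA": 200, "LangB": 100, None: 50}
sample_pulls = {"LangA": 150, "LangB": 40, None: 10}

averages = pulls_per_repo(sample_repos, sample_pulls)
# Rank languages from most to fewest pull requests per repository.
ranking = sorted(averages, key=averages.get, reverse=True)
overall = overall_average(sample_repos, sample_pulls)
```

Note that the overall figure is the total number of pull requests divided by the total number of repositories, which weights large languages more heavily than a simple mean of the per-language averages would.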

From the results, then, we see that Python repositories generate the most pull requests, on average, with .94 per repository. It's interesting that, by this measure, the Python community is the most giving, even though the language currently ranks only eighth in popularity in the latest TIOBE rankings. So the Python community may be smaller than, say, the Java community (#2 on the TIOBE list), but this suggests it's a tighter group.

After Python, we see that PHP (.83), CoffeeScript (.78), JavaScript (.74) and C++ (.74) also generate an above-average number of code contributions. Of these languages, C++ has the highest TIOBE ranking (#4). The top three languages on the current TIOBE list all scored below average on the number of pull requests: C (.65 pull requests per repository, #1 on TIOBE), Java (.58, #2) and Objective-C (.41, #3).

Does this really mean that the Python community is more helpful than other language communities? Not necessarily, of course. Using the number of GitHub pull requests as a proxy for outside (non-repository owner) contributions is far from perfect. Pull requests can come from outsiders forking and updating a repository, or from project members working on the same project under a shared repository model. Maybe Python developers are simply more likely to use shared repositories and pull requests as a development methodology.

Still, I think the results are interesting and suggest that, if you want to choose a programming language for a project that has a large, active and helpful community of developers behind it, you could do a lot worse than Python.

