I get this question often when I’m talking about how important it is for digital humanists to be able to write their own code. It’s an excellent question, and there’s no one right answer for every researcher. I’ve written websites and research/data tools in all three, though, and I recommend Python for beginning researchers in digital humanities (and computational social science, really). Here’s why:
- The syntax is way easy to read, and you can get the basics pretty easily.
- It’s straightforward to set up and run on Mac, Linux, or Windows PCs.
- The official docs and the community of developers are big and helpful.
- It has many, many modules (code that other people have written) that will help you and mean you need to learn only one language. For instance, you can import well-documented modules to
- Your collaborators, especially if they’re computer scientists, will probably know/prefer Python too.
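To give a feel for the readability and "modules" points, here’s a minimal sketch that counts the most common words in a bit of text using only modules that ship with Python (`re` and `collections`). The sample sentence and the `top_words` function name are made up for illustration; a real project would probably need smarter tokenizing.

```python
import re
from collections import Counter


def top_words(text, n=3):
    """Return the n most frequent words in text, ignoring case and punctuation."""
    # Lowercase everything, then pull out runs of letters (and apostrophes).
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Made-up sample text, just for illustration.
sample = "The whale, the sea, and the ship: the whale wins."
print(top_words(sample, 2))
```

Even if you’ve never written Python, you can probably guess what each line does, and that’s the point.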
What about PHP? Doesn’t everyone [on Craigslist, on Monster] want PHP developers?
PHP is a very useful language to learn if you want to do website development and maintenance work, but that’s not my gig anymore. I haven’t used PHP, except to hack my WordPress blog, in years. I do have some legacy code for collecting online discussion data in PHP though.
But Ruby’s hot. Why not Ruby?
Ruby’s gotten wicked fast, but she just doesn’t have the modules or community to make life easy for a beginning programmer. Again, I’m not building websites anymore, so I’m talking about what to learn if you need to roll your own code for research that involves collecting, cleaning, munging, and/or analyzing data. The Ruby modules that do support these activities are improving (see Stanford CoreNLP for text analysis, for instance), but they still lag far behind Python’s in features. I also find setting up and managing Ruby on my machines to be a real pain in the ass, especially if I need to switch back and forth between a Windows box and my Mac laptop. Once you’ve mastered the Python you need, if you’re looking for a second language or are seduced by Rails over Django for your web project, sure, learning Ruby would be great. But it probably won’t help you get your dissertation written or finish your first large-scale text analysis project.
I heard R was the way to go.
R also runs easily on Mac, Linux, or Windows, so that’s useful. It has some GUIs that are helpful for newbies too. But R does not make it easy to collect or clean data. If all the data you work with is already clean and formatted the way you like it, R could be right for you. I’d recommend R over Ruby for data analysis, actually. I’m sticking with Python for the whole data analysis pipeline though – she’ll get you from “I have no data” to “Publication accepted” without having to switch to another language. Whenever I use R, I have to clean my data in Python anyway, so I’ve basically stopped using R for analysis so I don’t have to keep switching back and forth.
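Here’s a rough sketch of what staying in one language can look like for the clean-then-analyze part of that pipeline, again using only the standard library. Everything in it is hypothetical: the column names, the messy rows, and the `clean_rows` helper are invented for the example, and the collecting step is left out so the snippet runs without a network connection.

```python
import csv
import io
from statistics import mean

# Hypothetical messy export: stray whitespace, inconsistent case, a missing value.
RAW = """author,year,word_count
 smith ,1901, 1000
smith,1902,
JONES,1903,3000
"""


def clean_rows(raw_text):
    """Yield tidy dicts, skipping rows that have no word count."""
    for row in csv.DictReader(io.StringIO(raw_text)):
        count = row["word_count"].strip()
        if not count:
            continue  # drop rows with a missing value
        yield {
            "author": row["author"].strip().title(),
            "year": int(row["year"]),
            "word_count": int(count),
        }


# Clean, then analyze, without leaving Python.
rows = list(clean_rows(RAW))
average = mean(r["word_count"] for r in rows)
print(len(rows), average)
```

The cleaned rows feed straight into the analysis step, with no exporting to a file and reopening it in a second tool.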