R is better than Python. Try telling that to the banks

The most serious data scientists prefer R to Python, but if you want to work in data science or machine learning in an investment bank, you will probably have to put your predilection for R aside. Banks heavily use Python instead.

“Python is preferred over R in banks for a number of reasons,” says the New York-based data science manager at a major bank. “There is a greater availability of machine learning packages like sklearn in Python; it is better for generic programming tasks and is easier to put into production; plus Python is better for data cleansing (like Perl used to be) and text parsing.”

Because of this, he says, banks have moved their data analysis almost entirely to Python. There are exceptions: some strats jobs use R, but for the most part Python predominates.

Nevertheless, R still has its fans. Jeffrey Ryan, the former Citadel quant star, is a big supporter of R and hosts an annual R conference in finance (canceled this year due to COVID-19). “R was designed to be data-driven and was designed by researchers,” says Ryan, whereas Python “co-opted R’s dataframe and time series, via Pandas,” the open-source data manipulation library for Python built by Wes McKinney, a former software developer at Two Sigma.

R is still used in statistical work and research, says Ryan. By comparison, Python is the “popular data analysis” tool and is easy to use without learning statistics. “Python found a whole new audience of programmers at the right time in history,” says Ryan. “When programmers (who outnumber statisticians) want to work with data, Python has the appeal of a single language that ‘does it all’, even if it technically doesn’t do any of it by design.”

Given the importance of data in financial services, one might assume that banks would favor the most capable language, even if mastering it requires extra effort. However, Graham Giller, CEO of Giller Investments and former head of data science research at JPMorgan and Deutsche Bank, says banks have chosen Python over R because their IT departments are mostly run by IT people rather than by people who care a lot about data.

“Personally, I really like R,” says Giller. “R is much more of a tool for professional statisticians, that is, people who are interested in inference about data, rather than computer scientists who are interested in code.” As IT professionals have gained influence within banks, Giller says banks have “replaced quants with IT professionals or with quants who basically want to be IT professionals,” and they brought Python with them.

For pure financial mathematicians, this is a bit frustrating. Pandas was built on R’s back, but took on a life of its own. “Pandas started out as a way to bring an R-like environment to Python,” says Giller, observing that Pandas can be “horribly slow and inefficient” by comparison.

However, most people don’t care about this: the more Python and Pandas are used, the more use cases they attract. “R has a relatively smaller user base than Python at this point,” says Ryan. “This in turn means that a lot of tools are starting to be built around Python and data, and that builds on its success.”
