Salient findings of the report uncover what is and isn't working in the data science field. These findings include:
- "Data science" is a new term for something that's been around for a while. Although the term is relatively new, 16 percent of data scientists reported that they have worked in the field for 10 years or more, which suggests that it describes work IT professionals have been doing for many years.
- Messy, disorganized data is the number one obstacle holding data scientists back. Two-thirds of respondents say cleaning and organizing data is the least interesting and most time-consuming task, taking time away from preferred tasks such as predictive analysis and data mining.
- There are not enough data scientists. Nearly 80 percent of respondents indicate there is a shortage of data scientists, suggesting that an increase in qualified data scientists would enable companies to balance workload and improve overall breadth and depth of their data science capabilities.
- Data scientists want more support from their companies. Nearly 79 percent of respondents are satisfied in their jobs, and almost one-third find their position "totally awesome," but they note that their organizations can still do more to equip them. Data scientists said that organizations can empower data science teams by providing the proper tools to do their jobs better (cited as a solution by 54.3 percent of survey respondents) and by setting clearer goals and objectives on projects (cited by 52.3 percent of respondents).
- Data scientists use a diverse toolkit dominated by open source. The survey found that although Excel is still the most commonly used tool (by 55.6 percent of respondents), data scientists also use at least 47 other tools and languages to do their jobs. Nearly all data scientists (98 percent) use open source software, and tried-and-true open source languages such as R remain a major part of the data scientist's toolbox.
- The most in-demand data science skill set is programming and coding. In addition to conducting the survey, CrowdFlower used its own data enrichment platform to collect and analyze 1,024 LinkedIn data scientist job postings. It found that the top two skills companies are looking for are programming and coding (seen in 55.3 percent of job postings) and statistical tools (seen in 52.1 percent of job postings).
"We know that data scientists are valuable for their companies, but there's still a disconnect between what they actually do and what they want to do," said Lukas Biewald, co-founder and CEO of CrowdFlower. "At the end of the day, the time they invest in cleaning data is time that could be better spent doing strategic, creative work like predictive analysis or data mining. If companies can give data scientists some of that data cleaning time back, they'll have happier teams that can focus on really exciting things."
Download the CrowdFlower 2015 Data Scientist Report and "Data Behind Today's Data Scientists" Infographic
- The complete survey findings are available in the CrowdFlower 2015 Data Scientist Report, which can be downloaded here: http://goo.gl/XavwZr
- A high-level overview of the results is visually illustrated in CrowdFlower's "Data Behind Today's Data Scientists" infographic, which can be downloaded here: http://goo.gl/zVyXBE
A total of 153 General Population respondents from CrowdFlower's online research panel completed the survey. Respondents work for companies of varied sizes and sectors, mostly in the U.S. All respondents have "data scientist" in their job title or job description on LinkedIn.
Interact with CrowdFlower
- Connect with CrowdFlower on LinkedIn: https://www.linkedin.com/company/crowdflower
- Follow CrowdFlower on Twitter: https://twitter.com/CrowdFlower
- Like CrowdFlower on Facebook: https://www.facebook.com/CrowdFlower
- Join CrowdFlower on Google+: https://plus.google.com/116115971590569804635/posts
Founded in 2009, CrowdFlower is the leading data enrichment platform for data scientists. Its quality-control technology is the most accurate and fastest way to collect, label, and clean data from an on-demand workforce. The platform automates the management of these online contributors to tackle tasks that require human intelligence such as search relevance tuning, data categorization, image annotation, metadata creation, sentiment analysis, transcription and de-duplication. Backed by Trinity Ventures, Bessemer Venture Partners, Harmony Partners and Canvas Venture Fund, CrowdFlower has over 150 customers including Unilever, Autodesk, eBay, Edelman, YP.com and VoiceBox.
Media Contact: Vivian Shic, Bhava Communications for CrowdFlower, 925-323-9382