The popular image of data science -- elegant statistical models surfacing hidden insights from vast datasets, changing business strategy overnight -- bears little resemblance to the typical data scientist's Tuesday. The gap between the marketing materials and the lived experience of the job is wider in data science than in almost any other technology role, and that gap has real consequences: people enter the field with the wrong expectations, get frustrated quickly, and either leave or struggle to perform.
Understanding what data scientists actually do on a typical day -- including the parts that nobody mentions in recruiting videos -- matters enormously for career decisions. If you love modelling and find data cleaning tedious, you should know that your day will be dominated by cleaning. If you dislike long stakeholder meetings, you should know that the senior versions of this role involve more communication and less coding, not the other way around.
This article covers what data scientists actually do across a typical week, how the work differs by seniority, how the daily experience varies dramatically between startups, large tech companies, and consulting environments, and what the realistic project cycle looks like from discovery to deployment.
"If I had to describe data science in one sentence, I'd say it's mostly being a janitor who occasionally gets to do science." -- Hillary Mason, founder of Fast Forward Labs and former Chief Scientist at Bitly, 2015 interview
Key Definitions
Data cleaning: The process of identifying and correcting errors, inconsistencies, null values, duplicates, and formatting issues in raw data before analysis or modelling. This typically consumes the majority of a data scientist's working time.
Stakeholder management: The ongoing process of communicating with business partners, product managers, and executives about analytical findings, project status, and data limitations. A critical skill that grows in importance with seniority.
Exploratory data analysis (EDA): The initial phase of any analysis project -- understanding data structure, distributions, relationships, and anomalies before formulating hypotheses or building models.
Technical debt: Accumulated shortcuts in data pipelines, model implementations, or code quality that must eventually be addressed. Data scientists working in resource-constrained environments accumulate significant technical debt.
Model monitoring: Ongoing tracking of deployed model performance to detect data drift, degraded accuracy, or unexpected behaviour in production. An often-overlooked part of the job after initial model deployment.
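To make the model monitoring definition more concrete, the sketch below computes a Population Stability Index (PSI), one common way to quantify how far recent production values have drifted from a training-time baseline. PSI is used here purely as an example technique; nothing in this article prescribes a particular drift metric. The function names, bin edges, and sample values are all hypothetical.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of values in each bin; `edges` are the interior bin boundaries."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) + 1)
    for v in values:
        # The bin index is the number of edges at or below the value.
        idx = sum(1 for e in edges if v >= e)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    edges: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    psi = 0.0
    for e, a in zip(bin_fractions(expected, edges), bin_fractions(actual, edges)):
        # Floor empty bins at eps so the logarithm stays defined.
        e, a = max(e, eps), max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


# Hypothetical feature values: training baseline vs. a shifted production sample.
baseline = [1, 2, 3, 4, 5, 6, 7, 8]
recent = [v + 5 for v in baseline]
edges = [3, 6]  # assumed bin boundaries, chosen for illustration only

print(population_stability_index(baseline, baseline, edges))  # 0.0
print(population_stability_index(baseline, recent, edges))
```

Identical samples give a PSI of exactly zero, and the value grows as the two distributions diverge. How much drift justifies retraining or an alert is a judgment call that depends on the feature and the model.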
The 80% That Nobody Talks About
The most consistent finding across surveys of data scientists is the time allocation breakdown. The 2022 Kaggle State of Data Science survey found that respondents spend an average of 26% of their time on data collection, 19% on data cleaning, 18% on building and selecting models, 11% on visualising data, and 9% on putting models into production. The two data-preparation categories (collection and cleaning) together account for approximately 45% of time -- and that is before you count the debugging and investigation required when data pipelines break.
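The arithmetic behind the 45% figure is easy to check. The short sketch below only re-adds the percentages quoted in the paragraph above; the variable names are illustrative.

```python
# Shares of working time (percent) as quoted above from the 2022 Kaggle survey.
survey_shares = {
    "data collection": 26,
    "data cleaning": 19,
    "building and selecting models": 18,
    "visualising data": 11,
    "putting models into production": 9,
}

preparation = survey_shares["data collection"] + survey_shares["data cleaning"]
listed_total = sum(survey_shares.values())
not_itemised = 100 - listed_total

print(f"Collection + cleaning: {preparation}%")    # 45%
print(f"All listed activities: {listed_total}%")   # 83%
print(f"Not itemised above:    {not_itemised}%")   # 17%
```

The five listed activities sum to 83%, so about 17% of reported time falls into categories that are not broken out here.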
A 2016 survey by CrowdFlower famously found that data scientists spend 60% of their time cleaning and organising data, and that they rate it as the least enjoyable part of the work. Years later, little has changed structurally. The tools have improved -- dbt, Great Expectations, and modern data warehouses make data quality management more tractable -- but the underlying problem remains: real-world data is messy in ways that cannot be fully automated away.
What does this look like in practice? It looks like spending half a day tracing why a conversion metric in the analytics table suddenly changed, only to discover that a product team changed how they log events three weeks ago without updating the documentation. It looks like a dataset that has customer_id in three different formats depending on which source system it came from. It looks like revenue figures that do not reconcile between the finance system and the analytics warehouse.
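The customer_id problem is a good example of how small this work is in code and how large it is in time. The sketch below assumes three hypothetical formats ("CUST-00123", "cust_123", and a bare "123"); in practice the formats have to be discovered by inspecting each source system, which is where the hours go.

```python
import re
from typing import Iterable, Set

# Assumed formats, for illustration only: optional "cust" prefix with "-" or "_",
# optional zero padding, then the numeric ID.
_ID_PATTERN = re.compile(r"\s*(?:cust[-_]?)?0*(\d+)\s*", re.IGNORECASE)


def normalise_customer_id(raw: str) -> int:
    """Map any of the assumed formats onto a single integer ID."""
    match = _ID_PATTERN.fullmatch(raw)
    if match is None:
        # Fail loudly: silently dropping unknown formats hides data quality issues.
        raise ValueError(f"Unrecognised customer_id format: {raw!r}")
    return int(match.group(1))


def unmatched_ids(source_a: Iterable[str], source_b: Iterable[str]) -> Set[int]:
    """IDs that appear in exactly one of the two sources after normalisation."""
    a = {normalise_customer_id(x) for x in source_a}
    b = {normalise_customer_id(x) for x in source_b}
    return a ^ b


# Hypothetical extracts from two source systems.
crm_ids = ["CUST-00123", "CUST-00456"]
billing_ids = ["123", "cust_456", "789"]

print(sorted(unmatched_ids(crm_ids, billing_ids)))  # [789]
```

Raising on unrecognised formats is a deliberate choice in this sketch: a fourth format appearing in production is exactly the kind of change that should surface immediately rather than quietly shrink a join.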
How Time Is Actually Spent
| Activity | Expected Allocation | Actual Allocation (Survey Data) |
|---|---|---|
| Data cleaning and preparation | 10-20% | 45-60% |
| Exploratory data analysis | 20-30% | 15-20% |
| Model building and tuning | 30-40% | 10-15% |
| Communication and presentation | 10-15% | 5-10% |
| Infrastructure and pipeline work | 5-10% | 20-30% |
A Junior Data Scientist's Typical Day
Junior data scientists (roughly 0-3 years of experience) spend most of their time executing defined tasks with close supervision. The day typically looks something like this:
Morning (9:00-11:00): Team standup (15-30 minutes), followed by hands-on work -- likely data exploration or cleaning on an assigned project. A junior DS might spend this time investigating why a model's output changed after a recent data pipeline update, writing SQL queries to understand the data, and documenting findings.
Late morning (11:00-12:00): Collaboration time -- a one-on-one with a senior data scientist to discuss approach, or a cross-functional meeting with a product manager who has requested an analysis.
Afternoon (13:00-16:00): Execution work. Feature engineering, model training runs, writing code, reviewing output. Jupyter notebooks are common at this stage, though companies with better engineering culture encourage moving to proper Python scripts with tests.
Late afternoon (16:00-17:30): Documentation, code reviews, updating project trackers, reviewing feedback on a presentation or analysis shared earlier in the week.
The coding-to-meeting ratio at junior level is typically higher than at later career stages -- perhaps 60-70% heads-down technical work, 30-40% communication and collaboration. The technical work often feels less glamorous than expected: more debugging, more data investigation, less clever model building.
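The move from notebooks to scripts with tests, mentioned in the afternoon block above, can start very small. The sketch below shows one hypothetical feature-engineering function lifted out of a notebook cell and paired with a plain assertion-based test; the function name and the recency feature itself are illustrative assumptions, not taken from any particular codebase.

```python
from datetime import date
from typing import Optional


def days_since_last_order(last_order: Optional[date], as_of: date) -> Optional[int]:
    """Recency feature: whole days between the last order and the as-of date.

    Returns None for customers with no orders instead of a sentinel number,
    so downstream code has to handle the missing case explicitly.
    """
    if last_order is None:
        return None
    if last_order > as_of:
        raise ValueError("last_order is after as_of; check timestamps and logging")
    return (as_of - last_order).days


def test_days_since_last_order() -> None:
    assert days_since_last_order(date(2024, 1, 1), date(2024, 1, 31)) == 30
    assert days_since_last_order(None, date(2024, 1, 31)) is None
    try:
        days_since_last_order(date(2024, 2, 1), date(2024, 1, 31))
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a future last_order")


if __name__ == "__main__":
    test_days_since_last_order()
    print("all checks passed")
```

A test runner such as pytest would pick up the `test_` function as written, but the file also runs on its own, which keeps the barrier low for someone used to notebooks.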
A Senior Data Scientist's Typical Day
Senior data scientists (roughly 7+ years) have a fundamentally different day. The shift is from executing tasks to defining problems, influencing decisions, and enabling others. Many senior data scientists report that they write substantially less code than they did at mid-level, and substantially more documents, presentations, and design reviews.
Morning (9:00-10:30): Two or three back-to-back meetings. A project kickoff with a product team discussing analytical requirements. A review of a junior DS's model design with detailed feedback. A leadership review meeting presenting results from a recent pricing analysis.
Mid-morning (10:30-12:00): Deep work on a complex analysis or strategy document -- the kind of work that requires sustained focus and is impossible to do well in 30-minute fragments. Calendar blocking is essential at senior levels.
Afternoon (13:00-15:00): Continued deep work if the schedule allows, or more meetings. Review of a junior team member's code. A conversation with a data engineer about pipeline requirements for an upcoming project.
Late afternoon (15:00-17:30): Writing -- a project postmortem, a recommendation memo, a one-page summary of findings for executive consumption. Responding to async questions from across the team.
The meeting load at senior level is substantial. Three to five hours per day in meetings is not uncommon, and poor calendar management can effectively eliminate all productive technical work. The most effective senior data scientists are disciplined about protecting deep work time through explicit calendar blocking.
Startup vs Big Tech vs Consulting: How the Day Differs
| Environment | Breadth | Depth | Infrastructure | Impact Visibility |
|---|---|---|---|---|
| Early-stage startup | Very high | Lower | Build it yourself | High |
| Mid-stage startup | High | Moderate | Developing | Moderate |
| Large tech company | Low (specialised) | High | World-class | Low (one of many) |
| Consulting firm | High (client variety) | Low | Client-dependent | Moderate |
Early-Stage Startup (under 200 employees)
At a startup, the data scientist is often the entire data function. This means building infrastructure that a big-company data scientist would never touch -- setting up the data warehouse, writing the logging code, creating the first dashboards from scratch. The breadth is genuinely exciting for people who like owning things end to end, but it is relentlessly demanding.
A typical startup data scientist day involves context switching between writing a SQL data model in dbt in the morning, debugging a broken Airflow pipeline after lunch, and presenting a cohort retention analysis to the leadership team at 4pm. The variety is high; the depth on any one thing is lower than a more specialised role would allow.
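A cohort retention analysis of the kind mentioned above can be sketched in a few lines. This is a minimal standard-library illustration, assuming months are represented as integer indices and using hypothetical user IDs; a real analysis would more likely run in SQL or a dataframe library against the warehouse.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def cohort_retention(
    signups: Dict[str, int],
    activity: Iterable[Tuple[str, int]],
) -> Dict[int, Dict[int, float]]:
    """Return {signup_month: {months_since_signup: share of cohort active}}.

    `signups` maps user -> signup month index; `activity` holds (user, month) pairs.
    """
    cohort_users: Dict[int, Set[str]] = defaultdict(set)
    for user, month in signups.items():
        cohort_users[month].add(user)

    # Sets deduplicate users who are active more than once in the same month.
    active: Dict[Tuple[int, int], Set[str]] = defaultdict(set)
    for user, month in activity:
        cohort = signups.get(user)
        if cohort is None or month < cohort:
            continue  # unknown user, or activity logged before signup
        active[(cohort, month - cohort)].add(user)

    table: Dict[int, Dict[int, float]] = {}
    for (cohort, offset), users in sorted(active.items()):
        table.setdefault(cohort, {})[offset] = len(users) / len(cohort_users[cohort])
    return table


# Hypothetical data: users "a" and "b" sign up in month 0, "c" in month 1.
signups = {"a": 0, "b": 0, "c": 1}
activity = [("a", 0), ("b", 0), ("a", 1), ("a", 1), ("c", 1), ("c", 2)]

print(cohort_retention(signups, activity))
# {0: {0: 1.0, 1: 0.5}, 1: {0: 1.0, 1: 1.0}}
```

Rows that fail the sanity checks (unknown users, activity before signup) are skipped silently here to keep the sketch short; in practice those are the rows worth counting and investigating.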
Large Technology Company (FAANG tier or comparable)
At a large tech company, roles are highly specialised. A data scientist at Google typically does not build pipelines -- a data engineer does that. The modelling work can be genuinely sophisticated, the data volume is enormous, and the infrastructure is world-class.
The tradeoffs are different. Big-tech data scientists often report feeling removed from business impact -- they are one of many contributors to a metric that is itself one of many metrics on a dashboard. The organisational overhead is high: reviews, design documents, alignment meetings, and approval processes that can slow a project from conception to production significantly.
Consulting
Consulting data scientists (at firms like McKinsey, BCG, Deloitte, or boutique analytics shops) work across multiple client engagements, often in 3-6 month project rotations. The pace is high, the client variety is stimulating, and the business impact is often visible quickly.
The downsides are equally significant: the technical depth is often limited because projects are too short to build truly sophisticated systems, the data is frequently incomplete or poorly documented, and the hours can be punishing during crunch phases.
Project Phases and What Each Feels Like
| Phase | Typical Duration | Primary Activity | How It Feels |
|---|---|---|---|
| Discovery | 1-2 weeks | Stakeholder meetings, requirement scoping | Engaging, light on coding |
| Data exploration | 1-4 weeks | SQL, EDA, data quality investigation | Often frustrating |
| Modelling | 1-3 weeks | Feature engineering, training, evaluation | The "fun" phase |
| Review and refinement | 1-3 weeks | Stakeholder iteration, validation | Slow, process-heavy |
| Communication and deployment | 1-2 weeks | Presentations, handoff to engineering | Underestimated in time |
The gap between expected and actual time spent in the discovery and exploration phases is one of the biggest surprises for new data scientists. The modelling phase -- which people imagine as core data science work -- is typically shorter than the data wrangling that precedes it.
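Adding up the ranges in the table gives a feel for how small the modelling phase is relative to the whole project. The sketch below copies the durations from the table; taking the midpoint of each range as a "typical" duration is an assumption made here for illustration.

```python
# (min_weeks, max_weeks) per phase, copied from the table above.
phases = {
    "Discovery": (1, 2),
    "Data exploration": (1, 4),
    "Modelling": (1, 3),
    "Review and refinement": (1, 3),
    "Communication and deployment": (1, 2),
}

total_min = sum(low for low, _ in phases.values())
total_max = sum(high for _, high in phases.values())

# Assumption: treat the midpoint of each range as the typical duration.
midpoints = {name: (low + high) / 2 for name, (low, high) in phases.items()}
modelling_share = midpoints["Modelling"] / sum(midpoints.values())

print(f"End-to-end project length: {total_min}-{total_max} weeks")   # 5-14 weeks
print(f"Modelling share at range midpoints: {modelling_share:.0%}")  # 21%
```

On these numbers a project runs 5 to 14 weeks end to end, and modelling accounts for roughly a fifth of it at the midpoints, less than data exploration alone.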
What Changes with Company Data Maturity
One dimension that rarely gets discussed honestly is the relationship between data scientists and their data. At companies with immature data infrastructure, data scientists spend enormous effort simply gaining access to the data they need -- navigating permissions, working around missing documentation, and dealing with pipelines that break regularly. At mature data organisations, the infrastructure is reliable enough that the modelling work can genuinely be the focus.
This structural difference makes the maturity of a company's data culture one of the most important factors to evaluate when considering a role. An impressive-sounding title at a company where data is not taken seriously means spending most of your time fighting infrastructure problems.
Practical Takeaways
If you are entering data science, mentally prepare to spend significantly more time on data cleaning and less time on modelling than you expect. This is not a temporary inefficiency you will work your way out of -- it is the nature of the work.
Evaluate potential employers not just on their role title and compensation but on the maturity of their data infrastructure. Ask in interviews: "What does the data pipeline look like? Who maintains it? How often do pipelines break?" The answers reveal the true daily experience.
Senior data scientists who want to stay technical need to actively protect their time. Calendar blocking for deep work is not optional at senior levels -- it is how you continue to do the work that makes you effective.
The communication skills that nobody emphasises in school -- writing clearly, presenting to non-technical audiences, structuring ambiguous problems -- matter more at every career stage than most data scientists expect when they start out.
References
- Kaggle. (2022). State of Data Science and Machine Learning Survey. Kaggle.
- CrowdFlower (Figure Eight). (2016). Data Science Report: Data Preparation and Cleaning.
- Mason, H. (2015). Data Science in Practice. Interview transcript, O'Reilly Strata Conference.
- Lorica, B. (2019). The Data Scientist's Reality: What We Do and What We Want. O'Reilly Media.
- Huyen, C. (2022). Designing Machine Learning Systems. O'Reilly Media.
- Yan, E. (2023). Applied Scientist Reflections: First Year to Staff Level. ApplyingML Newsletter.
- Bowne-Anderson, H. (2018). What Data Scientists Really Do, According to 35 Data Scientists. Harvard Business Review.
- Rogati, M. (2017). The AI Hierarchy of Needs. Hackernoon.
- Stack Overflow. (2024). Developer Survey: Daily Work Experience Section.
- Davenport, T. and Patil, D. (2012). Data Scientist: The Sexiest Job of the 21st Century. Harvard Business Review.
- Sculley, D., et al. (2015). Hidden Technical Debt in Machine Learning Systems. NIPS Proceedings.
- Kleppmann, M. (2017). Designing Data-Intensive Applications. O'Reilly Media.
Frequently Asked Questions
How much time do data scientists spend cleaning data?
Industry surveys consistently show data scientists spend 45-60% of their time on data collection, cleaning, and preparation -- leaving only 20-40% for the modelling and analysis work most people imagine when they picture the role.
How many meetings do data scientists have?
Most data scientists spend 2-4 hours per day in meetings. Senior data scientists typically have heavier meeting loads -- sometimes 3-5 hours daily -- across requirements discussions, stakeholder reviews, and team standups.
Is data science more creative or analytical?
Both, at different stages. Problem framing and feature engineering require creativity; model evaluation and statistical testing require analytical rigour. The ratio shifts depending on the project phase.
How does a startup data scientist's day differ from one at a big tech company?
Startup data scientists own a much broader range of tasks -- pipelines, dashboards, modelling, and stakeholder work. Big tech data scientists have specialised roles with better infrastructure support but often less direct visibility into business impact.
What is the most frustrating part of being a data scientist?
Data quality and data access issues are the primary frustration cited by most data scientists -- spending weeks cleaning data or waiting for engineering to fix pipelines before the actual analytical work can begin.