How this data scientist stays up to date in a fast-moving industry

Data science is constantly evolving, but Liberty IT’s Naomi Hanlon discusses how she keeps up and shares her advice for early-career data scientists.

Having worked as a data scientist for more than four years, Naomi Hanlon has experience in various industries from manufacturing and mobility to insurance. She currently works at Liberty IT.

For her, a typical day starts at 8am, when she catches up on emails and starts work on the day’s tasks, along with a short meeting with her team.

“Depending on the type and stage of project, the rest of my day will comprise some combination of meetings, data cleaning, exploratory data analysis, feature creation, creating presentations, presenting, collaborating, writing reports, code reviews, code walkthroughs and onboarding.”

Hanlon is currently trying hybrid working, which means half her time is spent at home and the other half is spent in the office. She also does a compressed week, meaning four longer days and Fridays off.

‘Data science is a fast-moving field so trying to keep up to date with the latest research is an impossible task’

What types of data science projects do you work on?

One of the first projects I worked on at Liberty IT was migrating existing models from SAS into Python. It doesn’t sound like the most exciting project from a data science perspective, but it was one of my favourites because I learned so much from it, and it has helped me in the ones that followed.

Being paired with an engineer allowed both of us to learn a lot about each other’s domain. As a result, I was writing much cleaner code – modular and with unit tests (shocking).

We had a consistent branching strategy in Git and performed code walkthroughs, pull requests and code reviews. Collaboration between data scientists and engineers is the norm for us now, but it was all new to me at the time. This project changed how I write code for data science.

Most recently we worked on the first continuous learning pipeline in our company. Typically, model performance declines over time, due to shifts in customer appetite, needs or trends.

Usually, these less performant models will require data scientists to retrain them using more recent data. This is time-consuming and requires a data science resource. Continuous learning addresses this problem.

Essentially, it is a repeatable pattern that allows models to adapt in production. It uses the most recent data to retrain the model on a frequent basis. This improves efficiency, as data scientists can use their time to answer new business questions rather than refitting models. It also ensures that models are staying up to date and providing more reliable predictions.

We got to use new (to us) tools such as Managed MLflow and Luigi to build out the pipeline and the outcome is a repeatable pattern that can be extended out to other areas.
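As a rough illustration of the retraining pattern described above, here is a minimal Python sketch. It is not Liberty IT's pipeline and does not use Managed MLflow or Luigi; the function names (`fit_line`, `mean_abs_error`, `retrain_if_better`), the one-feature linear model and the 25pc holdout are all assumptions made for the example. The idea it shows is simply: fit a candidate model on the most recent data, then promote it only if it beats the current model on the newest slice.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_abs_error(model, xs, ys):
    """Average absolute prediction error of a (slope, intercept) model."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))


def retrain_if_better(current_model, recent_xs, recent_ys, holdout=0.25):
    """Fit a candidate on the recent window (hypothetical helper).

    The newest `holdout` share of the window is kept back for comparison.
    The candidate replaces the current model only if its error on that
    slice is strictly lower.
    """
    split = int(len(recent_xs) * (1 - holdout))
    candidate = fit_line(recent_xs[:split], recent_ys[:split])
    test_x, test_y = recent_xs[split:], recent_ys[split:]
    if mean_abs_error(candidate, test_x, test_y) < mean_abs_error(current_model, test_x, test_y):
        return candidate, True
    return current_model, False


# Made-up data: the relationship drifts from y = 2x + 1 to y = 3x.
old_x = list(range(20))
current = fit_line(old_x, [2 * x + 1 for x in old_x])
recent_x = list(range(20))
recent_y = [3 * x for x in recent_x]
updated, promoted = retrain_if_better(current, recent_x, recent_y)
```

In a scheduled job, a step like `retrain_if_better` would run on each new batch of data, so a data scientist only needs to step in when the automated retrain stops helping.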

What skills do you use on a daily basis?

On the technical side, I would rely on Python most days. Although I love R, Python seems more accessible for cross-functional teams.

Communication is often talked about as an important skill in many different roles. For data science, this means having to communicate your findings or approach effectively. The unexpected skill is being able to pitch your information to the right level for your audience.

Some stakeholders will want the high-level findings and conclusions, an executive summary. More technical audiences will appreciate the granular information – what metrics, packages and methodology were used. Engineers will often just appreciate going straight into the code.

Business needs drive data science. The ability to develop a strong understanding of the problem, translate the question into an experimental design and deliver potential solutions which form part of the wider business strategy is key. This keeps us data scientists in a job.

What are the hardest parts of working in data science?

Data science is a fast-moving field so trying to keep up to date with the latest research is an impossible task. Blogs and podcasts are a great starting point and make new concepts accessible.

In work, we have a number of knowledge-sharing sessions within our data science teams to allow folks to share anything interesting. This can be something they have come across or have been working on. This has also been useful for developing best practices as we are continually sharing work and feedback.

I’ve already touched on how important communication is in data science. One aspect of communication I find difficult is presenting. Working remotely has been the best thing for me, where I can have prompts on screen as support while giving a presentation.

A lot of people over the years have told me that the more you do it, the easier it becomes. I hated that. I am also not fond of the fact that they were likely right. In my current role, I have presented more than I ever have and it really does get easier with exposure. The benefit of working remotely really can’t be overstated though, it’s been great.

Do you have any productivity tips that help you through the day?

I have a terrible memory, so I rely on jotting things down. Using a few minutes at the end of the day to make a list of the upcoming tasks has helped.

In true data nerd fashion, I have started tracking my habits each day as well. These are things I want to spend my free time doing like exercise, reading and listening to podcasts. Phones can be such a time suck, where an hour can fly by with nothing to feel good about after. Tracking habits, so far, has helped me spend my time doing something a little more interesting.

Having these habit goals helps with productivity in work as well. I find that I am much more likely to go on a walk or read a chapter on my lunch, just for the simple fact that I can track it. It gives me a real break from screens, and I get to add another data point to the habit tracker. Win-win.

What skills and tools are you using to communicate daily with your colleagues?

We haven’t been immune to the Slack v Teams debate but have landed on Teams for project work. It’s more accessible company-wide.

I’ve talked about knowledge-share sessions and code walkthroughs – these all take place on Teams with screen share. We do use Slack, but it is usually for more informal communication, with channels for cooking, exercising and gaming.

What do you enjoy most about working in data science?

At the core of it, I really just enjoy trying to solve things. This includes puzzles, riddles, Rubik’s Cubes – anything. This is also probably why I love crime dramas and thrillers: the element of investigation and being part of unravelling what is really happening. This is a big part of what I get to do as a data scientist – there is a question and we use data to try to answer it and solve the problem.

Specifically in this role at Liberty IT, I have a lot of autonomy over my work. This gives space to experiment, learn and iterate, which is important in data science.

Typically, we work on short-term engagements, meaning my work is varied as well. I get to apply different approaches and techniques depending on the business area and their goals.

What advice would you give to someone who wants to work in data science?

My first bit of advice is to get involved in data science. Vague, I know. We all come to data science with different backgrounds so find something that works for your level of interest and the time you have available to dedicate to something new.

As an entry point, there are plenty of online resources you can dip into: tutorials, courses and YouTube videos. With that, there are plenty of open data sets available to get you started with the basics, such as reading in data, visualisations and exploratory data analysis.

Past the basics, one of the best things you can do is start trying to use a real data set to answer a question you are interested in. Tutorials and toy data sets will only get you so far, so start building on these skills with a project that you have defined.

Being part of a wider community will keep you engaged in the data science world and is a great way to learn more – there are lots of ways to get involved including online communities, meetups, forums and Kaggle.

Making connections and learning about the day-to-day work will help direct you towards more specific goals. This will help determine what skills and qualifications you need as you progress on your data science journey.
