Lauren Burke 00:04
Welcome to Women in Analytics After Hours, the podcast where we hang out and learn with the WIA community. I'm your host, Lauren Burke, and I'd like to thank you for joining us today. Today, we are coming to you live from the 2022 DataConnect conference hosted in Columbus, Ohio by WIA.
For our very first episode, I'm excited to have Elizabeth Gilbert joining us to chat about a topic she's very passionate about. Elizabeth is a Product Data Analyst at Olive and the organizer of the Cincinnati Data Science meetup. Formerly, she helped lead the Big Data and Analytics Association at the Ohio State University, where she empowered and connected students who are interested in analytics to help them get into the industry.
Elizabeth has actually been involved with the WIA community for many years, starting off on the organizing team for the 2018 conference back when she was still a student at OSU. We are very excited to have her joining us here today.
Welcome Elizabeth, and thank you so much for joining us today.
Elizabeth Gilbert 01:05
Thank you for having me, Lauren. It's great to be here.
Lauren Burke 01:07
To start off, could you give us a little background on yourself and tell us a little bit more about what you do in the analytics space?
Elizabeth Gilbert 01:12
Sure. So my background in analytics comes from the Data Analytics degree at Ohio State, the undergraduate major. That's really what got my start here in Columbus involved with things like the Women in Analytics Conference and the Big Data and Analytics Association at Ohio State. Those things are really relevant because what I care about, this like, getting a vocabulary of analytics, started there for me.
So that's my background. That's where I started. I have worked as a data scientist at a very small startup, and I currently work as a Product Data Analyst at a very large startup. Currently at Olive, which does healthcare AI.
Lauren Burke 01:47
Awesome. And so we were chatting yesterday and Elizabeth, you mentioned your passion for learning and defining the vocabulary of data and analytics, which is such a fascinating topic. So we had to have you on the podcast today to learn a little bit more about that. But for those of us that aren't as familiar or maybe new to the field, what do you mean when you say the vocabulary of data and analytics?
Elizabeth Gilbert 02:09
Yeah, I think for a field that has grown and changed so much and is so interdisciplinary, we have a lot of people coming from a lot of different backgrounds. The vocabulary of analytics has come from a lot of different places, it's a lot of statistics and computer science. You know, that is what data science is as a whole, with a lot of domain application.
So the vocabulary comes from those places. And what I mean, when I say the vocabulary, is just how we talk about it, more or less. And for somebody who's getting into the field or learning about it, whether they're a student or experienced in a domain and getting more into data science or different parts of analytics, different job functions, or things like that, into leadership or anything that touches it, it's learning how do we talk about it? And how do you ask questions about it? Especially as someone who's trying to learn, right. The vocabulary of it helps you to demystify and feel more comfortable in this space. That's what I mean when I say that.
Lauren Burke 03:01
Right, and that's such a good thing to be focusing on. I think it's a really interesting topic to dive into as you're first getting into the data and analytics space, because you are coming from a place where I feel like you are more accepting of new things you're learning and data and analytics, as you said, it is sort of a combination of all of these fields that work with data and work with analytics.
And 10 years ago, it might not have been called a data scientist role, but now it is considered a data scientist role in every industry, and all of those industries have their own terms. They have their own phrases you use to describe specific things. And they're all sort of related to each other in different ways, which is why I think it is a really interesting thing that you're trying to do here: make it easier for people to communicate when they are using those terms and maybe not seeing eye to eye, but really trying to say the same thing in a different way.
Elizabeth Gilbert 03:51
Yeah. One of the places that I saw that was when I was working at Caterpillar. I was in the middle of my data science degree as an internship, and a lot of the data scientists, in the division that I was working in, came from a mechanical engineering background. And one of the pieces of vocabulary that I found the most surprising that was different, cuz their background is that they said data mining as collecting data when really data mining means, or you can think of it as mining knowledge from data.
And so my objection was it's really important to be all on the same page about what the words we use mean so that we can all communicate in this really interdisciplinary field, across disciplines. And that's not, you know, to throw shade at the Caterpillar folks, because they did fantastic data science, but it also matters how we talk about it. Especially when you're bringing in new people, me, not from a mechanical engineering background, but from a data analytics background, working with them.
And that's a pretty manageable example, but it goes, you know, beyond that too.
Lauren Burke 04:44
That's such a good example, too, because I feel like data mining on its own is sort of considered its own role, its own field, like its own area of analytics, and to describe data collection as data mining, you could easily have a lot of miscommunication. So I love the example you brought up with that.
Elizabeth Gilbert 05:02
Clear communication in interdisciplinary fields like data science is so important, so we cannot risk that.
Lauren Burke 05:08
So what led to your interest in this topic?
Elizabeth Gilbert 05:10
Really remembering my own background and seeing my own journey through learning and getting an intro to analytics for me, I mentioned the Big Data and Analytics Association at Ohio State, the student org.
When I was a student, I knew that I wanted to get experience faster than my coursework would kind of give it to me and so that student org is something I really leaned into. And it can be kind of overwhelming to show up to a new space where people are using all kinds of words and talking about, and showing diagrams to explain and talk about things that you've never heard of before.
But exposing yourself to that and being aware of what you're learning and trying out the words for yourself and using them to ask intelligent questions and to know the boundaries of your knowledge and knowing it is something you need to dig into more and ask more about. Was something that really propelled me. And I found it really, really helpful to feel confident in this space and to keep showing up.
And that just builds upon itself. Right. You have the knowledge and experience to do projects and to get jobs and to dip your toes in the water and figure out what you like. And even directed me towards my interests in my direction for my career in analytics. In the steps I wanna take and where I wanna go. So that's why that helped a lot for me.
Lauren Burke 06:20
I like the way that you are describing this, which you're almost describing learning this vocabulary as you would learning a vocabulary of a new language you're learning. And I think it applies really well with the communication aspect you were saying, because you might be trying to describe some concept that is abstract when you are having to use sentences or even short strings of words to describe it.
But maybe there's already one specific word, like when you're saying data mining, and instead it's being described as collecting data and just storing it, but really that's not what you're truly doing with that term. So there's a better way to describe it. And once you learn that, it's not only easier for you to understand what you're doing, but it's easier for others to understand what you're trying to accomplish.
So how has growing your data science vocabulary helped you grow as a data scientist?
Elizabeth Gilbert 07:07
Yeah. I mentioned a little bit that it helped me to find direction in what I want to do in my career in analytics. Coming in and being interested in the tools of statistics and computer science and choosing this as a career path and field, because of that feels like, you know, a lot of direction, but really there's as you know, a million and one applications of analytics and data science.
So it really doesn't narrow down what industry or field you wanna work in at all. But getting that vocabulary. Specifically, one of the things that helped me to be the most aware and reflective in what I was enjoying the most in analytics and what I was doing and where I wanted to go. Is the concept, that I think is really important in getting the vocabulary of analytics, of curiosity.
You have to be really curious. You have to always ask questions. You have to never take things at face value and when something looks weird or doesn't make total sense to you. Ask why and figure it out and be curious and dig deeper. You know, whether that's the data or the story it's telling or anything involved in the process, that curiosity is really important.
And so listening to what I was curious about, in both my career and my academics and what I was learning and trying and getting in this career path and my day to day life. What kind of questions I was naturally asking about the world and the processes I interacted with, led me to realize that most of those questions were about how I personally interact with technology and how, when you think about data being able to be captured about anything, how those interactions could be captured and how that can scale.
I think one of the coolest things about data science is the scale that you can answer questions at. And so my natural curiosity was about scaling my interactions to, generally, all users interacting with the things that I was interacting with.
I'm not one, you know, I'm not an isolated case, interacting with the tools that I use. There's so many other users utilizing the exact same things. And what kind of patterns exist at that level? I am so curious.
And so that really directed me towards being really interested in quantitative user experience research and user interaction data. And that's why I'm doing product data analytics now, because I'm curious about what kind of questions we answer when we look at specifically product analytics.
Moving more specific from a general data science background.
Lauren Burke 09:08
So you were a data scientist and I'm assuming you still use those skills on the day to day. What sort of skills do you think are important for a data scientist? And do you think a good data and analytics vocabulary is necessary to be a successful data scientist?
Elizabeth Gilbert 09:24
Totally. I think that the vocabulary is something that comes with the skills and the job, right. If you're a successful data scientist, you probably are really good at talking about data science and analytics. And whether you know it or not, you could probably describe that vocabulary to someone else. And I think it's really empowering to those people that you can provide that to and have that conversation with, whether one on one or at a larger scale, to be able to do that.
So if you're a data scientist, you probably have that skill, whether you realize it or not. But if you're a student or somebody in a class learning data science, that skill is something that you're absolutely right, can totally get you there and help you move in the right direction. Like I've mentioned, get you there faster and help you to be reflective and taking good next steps in your journey in analytics.
Lauren Burke 10:07
That makes a lot of sense, because I know, especially when you're interviewing for a data scientist role, one of the big things that is sort of a focus is how good of a communicator are you. Because not only do you have to be the one finding the insights, but you have to be the one to be able to communicate them in a way that's effective and can be understood by both people that are in the data world and in the business space.
And your data science vocabulary can really improve your communication skills and just how effective you are at that.
Elizabeth Gilbert 10:37
Yeah, I think that, to the other half of the question you asked, that communication is such an important part of the process. Because if you don't communicate what the actionable insights from your analytics are, so that somebody takes action on it and does something, then your whole analysis was basically for nothing because nothing happened as a result of it.
We're not doing things just to know them, keep them to ourselves. You know, there has to be something that happens because of it. You know, whether that's in academia, knowledge is shared, that helps you further things or it's in business where you make a change in the business because of what you found, what you discovered to be true in the data.
There's other parts of this vocabulary and kind of crash course in analytics. The process of analytics being really important to be aware that, you know, some words, we use interchangeably. Data cleaning, data munging should be expected to take up maybe 80% of the time of the process. And that's totally reasonable because data is more often than not very messy and challenging to work with.
And if you don't take the time to know that that is an important, big part of the process. Know how to talk about it and be prepared to actually do it. There's gonna be things that you'll miss and important nuances in the data that won't be captured by your analysis, it'll be overlooked. And so that's one of the things that is totally a part of every analytics process and knowing that it exists knowing how to talk about it is something that is very helpful.
Lauren Burke 12:00
Yeah, I like that you bring up the project workflow.
So in your role as a product data analyst and leaning on that analytics and data vocabulary, what do you think an analytics project workflow looks like?
Elizabeth Gilbert 12:13
Totally. So the way I have liked to structure my workflows and kind of make sure I'm reflecting on hitting all the steps. Like I mentioned, the data cleaning, the communication, there's a lot of other parts. The framework that has been helpful for me is called CRISP-DM. That stands for Cross-Industry Standard Process for Data Mining. Like you talked about, data mining can be thought of as mining knowledge from data. That's just a fancy title to say, you know, one process or workflow for analytics.
And so the steps there start with a business understanding and move to a data understanding. So in business understanding, you have to really know why, and ask: what is the question we're trying to answer? What is our goal? And that is as critical as communication, because if you're not clear on the question you're answering and you answer the wrong question, your analysis was also for nothing because you can't use what you created, the answers you found, because they weren't applicable and you really need to find something else. So having a clear business understanding is so important.
And then the clear data understanding. And that's where a lot of your exploratory data analysis comes in. That's a critical part of it.
And then moving on to your data preparation, data cleaning, data munging, some words used interchangeably there. Like we talked about a huge part, maybe 80% of most analytics processes.
And then onto the really fun part that everyone thinks of the modeling. That's great.
And with all these parts, you know, maybe you find something in your modeling that reveals something you have to clean or prepare differently. And you'll go, you know, back and forth. You know, it's not a linear step by step. It's not that you never go back to data cleaning as soon as you start modeling, you know. So that's your fourth step.
And then after you have a model, you have to evaluate it somehow. And for models, you probably have some output, some, you know, way to evaluate the model numerically. Sometimes you're doing analyses that are not modeling. They're not advanced, you know, AI/machine learning. Maybe you're just doing a simple significance test or you're looking at correlation. But just because you're not modeling doesn't mean you skip data cleaning or business understanding, right?
So modeling can be thought of as whatever analytics process you have chosen as the appropriate thing to do. Maybe you're not at a modeling stage. Maybe you're at a, something that's a little bit closer to exploratory, but is something that's revealing a lot of new knowledge. It doesn't have to be the flashiest newest, deep learning thing. Just to satisfy that part of the process.
So whatever you've done as your analytics process, modeling or otherwise, you have to evaluate it. See if it answered your question and if it did great. And if it didn't, you know, you're going back to the process and iterating. And like we talked about the communication, deploying or sharing that communication. Whether you're putting a model into production live and you're going to do your ML Ops or DevOps and keep it, keep it going.
Or you did something where you're sharing it in a different way to create action. You're sharing those insights through a slide deck or a report. There's a lot of different ways that these parts of the process can look, but it's important to know what's coming and to keep all these parts in mind.
For me personally, since I learned about this framework several years ago, I've structured my projects on my computer as like one underscore background, two underscore cleaning, three underscore analysis. Take these parts and that gives you a really clear checklist. And then you have a folder for results and a folder for your scripts. And you can structure it that way and keep it kinda tidy.
I remember the first ever project I did in college, my summer research. Everything was in one folder and I quickly learned my lesson that that was not the way to do it. So being aware of the process, you know, came a little bit later for me and, take my word for it, it is very helpful to structure it like that.
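[Editor's note: the folder layout described above could be set up with a short script. This is only a minimal sketch, not the speaker's actual setup. The folder names are an assumed written form of the spoken names ("one underscore background" as 1_background, and so on), and scaffold_project and my_analysis are hypothetical names chosen for illustration.]

```python
from pathlib import Path
import tempfile

# Assumed written form of the folder names spoken in the episode;
# rename these to match your own project conventions.
PROJECT_FOLDERS = ["1_background", "2_cleaning", "3_analysis", "results", "scripts"]


def scaffold_project(root):
    """Create the project folders under `root` and return their paths."""
    root = Path(root)
    created = []
    for name in PROJECT_FOLDERS:
        folder = root / name
        # exist_ok=True makes the script safe to re-run on an existing project
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind
    with tempfile.TemporaryDirectory() as tmp:
        for folder in scaffold_project(Path(tmp) / "my_analysis"):
            print(folder.name)
```

[The numbered prefixes keep the folders sorted in workflow order, so the directory listing doubles as the checklist mentioned above.]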
Lauren Burke 15:44
So the key takeaway from that is folders are your friends.
And so, as someone who has been through this analytics project workflow a number of times, what is your favorite part and what is your least favorite part?
Elizabeth Gilbert 15:56
Gosh, I mean, maybe the most satisfying part of anything for me in analytics is when you overcome a tricky problem. And that could look like when the process is complete and the tricky problem of the whole project is done and you get to communicate it. And that's an easy answer because the communication is always a fantastic part. But sometimes it's in data cleaning, when you found something and you're like, oh my gosh, thank goodness I know about that and now I get to find a new, you know, like pandas function that totally solves my problem. And that's fantastic.
Lauren Burke 16:28
I love when you find a new pandas function.
Elizabeth Gilbert 16:29
Oh my gosh.
Lauren Burke 16:29
There are so many, I found a new one last week.
Elizabeth Gilbert 16:32
It's so good. It's so good. And it feels so satisfying to know that you've taken the time to discover the problem and find a good solution.
Yeah. So they can look like a couple different things, but that's definitely my favorite part.
Lauren Burke 16:43
Awesome. So what are you currently working on and what excites you about that and about your product data analyst role at the moment?
Elizabeth Gilbert 16:52
Yeah, we're currently at the stage on my team of getting a really solid set of metrics for the products that we have to really communicate full stories about them. And so it is really challenging and really cool to be a part of that work and to be doing a lot of communication. Like we talked about, getting that business understanding with a lot of different stakeholders.
I mentioned that I come from a background where I went into analytics because I love statistics and computer science. I work at a healthcare AI company and my background is not in healthcare, but one of the coolest things I think about data science going back to that is learning new domains and applications.
And so for me, like I talked about curiosity. That plays a huge part in it. I have to be totally willing to be open-minded and say, I know options of processes and solutions, but I need to understand the context. I need to have good collaborative conversations to fully understand our goals, our business understanding. And once I have that, that firm understanding I can help move us forward, but I need to be humble and curious and collaborative. So there's a lot of that being a very important part of our process right now.
But it's super cool to see the impact these products have and the way these products connect healthcare solutions. Replace something where a human would have to copy and paste and, you know, never finish their amount of work to do. And we're able to do that better and let them do more important work and have a bigger impact in their job.
Lauren Burke 18:20
That's awesome. I love your just genuine openness to learn whatever you can, accept it if it's changing. It sounds like you are in a role where you see a lot of different things, a lot of exciting things, and you really have to have that open mindedness.
I know that's something that makes you a better data analyst, better data scientist and so I appreciate your vouching for that approach to doing these sort of problems. Having a curious outlook, I think that can only help you in your career, especially as a younger person or a new grad going into the field, or even as someone who is a newer data analyst changing a career.
So for those that are listening and are interested in learning more about data vocabulary, maybe improving theirs, what sort of resources would you recommend?
Elizabeth Gilbert 19:07
Yeah. So recently, I actually gave a webinar on a crash course in data analytics, where this data vocabulary is the thing that I talked about in my personal webinar. I gave it collaboratively with someone who gave a lot of great context on AI, machine learning, deep learning. Some things that I haven't worked with necessarily as much, and he has a great background in ML and deep learning. So it was really great to hear. So you can hear both sides of that. The best place to find that is probably on LinkedIn.
Lauren Burke 19:33
We will share the link in the podcast links.
Elizabeth Gilbert 19:36
Awesome. One of the ways that I like to describe the vocabulary of data analytics. And like I just described the process. There's some great diagrams that really illustrate that well. CRISP-DM, if you just Google it, has a diagram. But that's part of the presentation as well.
And when I talked about domain application, statistics and computer science, being the three parts that come together. I really love the Venn diagram that shows the overlap of those. It's really interesting to see what part of that you fall in and where you might wanna move towards, because the middle can be thought of as the data science unicorn, and it can be a helpful tool for reflection.
Like, am I strong in my domain and a lot of stats and I wanna learn how to code more and get more towards that side? Like, what of these labels might help? So the diagrams are not podcast material, because they're so visual, but they can be a really helpful tool. So I definitely encourage checking those out and looking at them.
Lauren Burke 20:22
That's awesome. So from this presentation, if people were to watch it, what are maybe two key takeaways that they could learn and why they should watch your presentation?
Elizabeth Gilbert 20:31
Hmm. So we've already mentioned a lot of them. The process of analytics. I talked through those big takeaways, the data mining, the data cleaning, the business understanding, the communication, understanding the parts of data science, data science unicorn.
Something else that we cover in the presentation that I talk about specifically, that I think is helpful, is the analytics maturity model, which Gartner put out several years ago. And as any good statistician will tell you not all models are true, but some are useful. And this is true of this one too. It kind of illustrates the trajectory progression movement from descriptive analytics to diagnostic analytics, to predictive analytics, to prescriptive analytics, right?
If you want to predict the future and prescribe an outcome, that's probably not the place you're gonna jump into if you're just starting an analytics practice in your organization or your company. And it's totally okay and actually can make a ton of value if you don't have descriptive analytics and you're starting there. So it can be a helpful framework for that. But also on the flip side, maybe your solution is a prescriptive analytics solution, and that is like all the value you're adding, your whole value add.
And so in that way, it is not a one step at a time necessarily. But it can be really helpful for understanding what are some questions that I should be able to answer in what order? And just gives a framework for reflection, you know?
Lauren Burke 21:55
That's such a good point. I think it's important for people that are trying to establish an analytics or data science presence in their organization, that you not just try and jump right in and say, we need AI, we need ML. When in reality you can start and you can build up your analytics organization by adding smaller pieces and using the descriptive and maybe some of the diagnostic and you don't need to do predictive right away. There's a lot of insights you can gain from your data just by looking backward at it.
Elizabeth Gilbert 22:22
Totally. Just like not everyone needs big data to have an impactful application or insight. Big data is relative to the industry. Big data in astrophysics looks way different than big data in like sociology. Your sample size is gonna be incredibly different, the amount of data you have to store. And that doesn't mean that the sample size in one area is not useful or valuable or can't have any good insights from it.
Just because something's a hot buzzword keyword doesn't mean that it's required to have good analytics. And that's one of the takeaways of having a vocabulary of analytics. Where maybe you've heard these things a bunch and maybe you're trying to get more into them, but understanding the context around them, and what they really are, and when and where you use them and need them, is really important.
Lauren Burke 23:09
That's awesome. I think that you've really summarized why it's so beneficial to have a vocabulary, why it's so important to learn the vocabulary and why it's important to continue keeping up with that and updating it as things change, as your industry, as your role evolves. So I really love everything you've shared with us today.
Before we close out, I'd like to ask, can you recommend to the listeners a resource that has helped you in your analytics career and that you feel would help others?
Elizabeth Gilbert 23:37
Totally. The resource I'm gonna recommend is going to be in-person interaction and community. Not only for people who are learning the field, transitioning into it, getting connected. But also like you mentioned for keeping up with and learning more about and staying connected in the field and keeping up to date with things that are new and changing.
I think that it's possible to learn everything you need to about analytics just by possibly reading and learning, maybe doing projects on your own and isolated. But especially with being able to talk about analytics, having the community to have conversations with. And maybe you go in person to a local meetup where someone's presenting and you ask the presenter more about the presentation or share something or pose a question. One of my favorite questions, if you don't have one, but you wanna connect, is what is one thing you didn't include in your presentation that you could tell me about the topic? So feel free to steal that and go meet in person and jump into it.
No matter how unqualified to be at the space you feel, or how new, or how curious you are. Just show up and over time with consistency, the vocabulary and the confidence and the knowledge and skills and direction will come. I think that's a really important part of it. Local meetups aren't the extent of it. I would totally recommend local, national conferences. The DataConnect conference is a great place to come for your vocabulary. I'm biased because I've been here plenty. And it's one of my favorite places to keep up to date on the vocabulary of analytics and learn new things today.
For example, at this session that I was just at - we're recording this at the conference - I just learned a little bit more about what data governance is. And as a data analyst, a data scientist, it is important for me to be aware of that. And it's been a word that I've heard, but I didn't know very much about. So I'm really glad that I got to talk to the person sharing those thoughts. And now I have way more to think about and a little bit more to add to my vocabulary.
So you're really never done learning and everybody can benefit from in person connection and community. And thank goodness we can do a little bit more of that now, as opposed to when we were limited to virtual. But, yeah, you got it. You can do it. I believe in you with learning analytics.
Lauren Burke 25:55
That's such a good outlook. I love everything you just said. The question you suggest posing, that is such a good idea because no matter what you're presenting on, there's always more you can add. And there's always something you had to cut out that you really wanted to add. And so that's, sometimes that's something that someone is really enthusiastic about, they just didn't have time or didn't have space to put it. So I think that is such a great suggestion.
So, where can our listeners find you on the internet if they want to keep up with you or they want to connect?
Elizabeth Gilbert 26:23
Totally. I would say LinkedIn is the best place. My name is Elizabeth Gilbert, but there are many Elizabeth Gilberts out there. So look up Olive, if you want with my name.
Lauren Burke 26:32
We'll include a link to Elizabeth's LinkedIn, and you can keep up with her, follow along with her work, and when she presents more on the data science and analytics vocabulary, you can continue your learning of the topic.
Elizabeth Gilbert 26:45
Yeah, and if you want to learn more by conversation or get connected in the Columbus or Cincinnati areas, I'm aware of the analytical communities there. I'd be happy to connect you individually.
I think for me, the way that I developed this, my thoughts around the vocabulary of analytics, the crash course in analytics, learning it, is by individual conversations. And I know it can be really intimidating, but take the encouragement I've given as my confidence in you learning analytics, listener, but also know that you can reach out to me or plenty of other people in the analytics community. And you know, like Lauren said, the speakers, people that you're sharing the same space with, are enthusiastic about sharing their thoughts and experiences and helping you have your own wonderful impact in analytics.
Lauren Burke 27:27
Well, thank you so much. Thank you so much for being here. Thank you so much for sharing more and teaching us more about having a great data science vocabulary. It was such a fun conversation. I really appreciate you joining us.
Elizabeth Gilbert 27:38
Thanks so much for having me, Lauren.