Episode 110 | February 17, 2021

Edouard Mathieu: An Open Data Approach to Solving the World’s Problems

Edouard Mathieu, from Our World in Data, says data can be a force for change as we address problems ranging from COVID to climate change.

Guest

Edouard Mathieu

Head of Data, Our World In Data

Highlights

Our World in Data was founded on the premise that data can tell stories about what is good or bad in our world and how we can make things better.

The non-profit is committed to providing trustworthy information that can be shared and enhanced to make it even more useful in solving problems.

Much of the data is aligned around the United Nations Sustainable Development Goals (SDGs) which outlined critical goals for humanity to address by 2030.

Global data reveals a mixed bag in terms of the direction of our world, with some areas showing promise and others in decline.

Episode Links

Our World In Data

Data from Our World In Data

https://github.com/edomt

OWID’s GitHub Page

United Nations Department of Economic and Social Affairs: Sustainable Developme…

Transcript

IVAN STEGIC: Hey Everyone! You’re listening to the TEN7 Podcast, where we get together every fortnight, and sometimes more often, to talk about technology, business, and the humans in it. I’m your host Ivan Stegic. My guest today is Edouard Mathieu, who is the head of data at Our World in Data, a non-profit which is about research and data to make progress against the world’s largest problems. Ed works on the whole chain of collection, transformation, documentation and dissemination of data at OWID. And today we’re going to spend some time talking about the work they’re doing globally, and why it’s so important, especially right now.

Hello and welcome, it’s so nice to have you on the show Ed.

EDOUARD MATHIEU: Hello Ivan, and, yeah, thanks for inviting me. It’s very nice to be here.

IVAN: I’m so glad I saw you on Twitter and tweeting about all of the wonderful data that you guys have made available.

EDOUARD: Yeah, it’s been a pretty intense period for me on Twitter, a lot of tweeting, lots of new followers as well, and I’m glad you were among them and I’m glad we met.

IVAN: Yes, me too. So, where are you joining us from today?

EDOUARD: I’m joining you from Paris where I live. The Our World In Data team is actually split between a few countries and even a few continents. Some of us are in Europe, especially in the UK. But I’m in France, and we’ve got a few people with you in the U.S.

IVAN: You sound like you are a Frenchman with an English accent. Your English is excellent, but it sounds like you could’ve been an Englishman as well. Have you spent time in England? I would imagine you have with the team being partly there.

EDOUARD: Yes, exactly. So I lived three years in Oxford which is when I met some of the people who founded Our World In Data. I was there to work for the university. And thank you, I just kind of picked up some of the accent. Before that I grew up watching American TV shows, so I used to have a much more American accent, and now I have a weird blend of both, I think.

IVAN: I grew up in South Africa, so I used to have a South African accent. Watched TV all the time, American TV, and then when I immigrated to the United States, my accent changed, and I don’t think anyone who knew me in elementary school would recognize the way I speak right now. [laughing]

EDOUARD: Yeah, I’m not surprised. It’s a weird thing and some people end up thinking I’m Canadian somehow, because apparently, it’s the right mix. But, yeah, it’s just a bit confusing for people, and some of them do expect a thicker French accent when they listen to me.

IVAN: Let’s talk a little bit about your work and what you did before you joined Our World In Data, and what your role there is. So, you studied in Paris, I think. Tell me a little bit about what you studied and what you were doing before you joined the team.

EDOUARD: What I studied initially had little to do with what I do now. I had a few bumps in the road, let’s say, so I initially studied Social Science in a university in Paris called Sciences Po, which in French means political science. And there I studied a mix of economics, philosophy, history, sociology and political science of course. After this I actually had a degree in digital marketing, which is very different from what I do now.

So I worked in that field for a few years, and ended up liking some aspects of it and really not liking some other aspects of it. Namely, I didn’t really like the marketing aspect too much and the PR aspect, but I did realize that what I did like was crunching numbers, and whenever we had to produce a dashboard for clients to report on the performance of something, and whenever I had to open a spreadsheet, me and my colleagues actually had a good time, and I liked that. That made me realize maybe I could put away the PR aspect and just focus on the data, and that’s what I did.

It was the golden age of online courses, so I took a bunch of them on Coursera. That was really the time where it was all free and wonderful. So, I did all of that, and then I moved to Oxford for personal reasons to get there with someone. And I applied for a job at the university, and I was very fortunate to get that job. And then I spent three years there working as a data scientist with the University with two different public health departments, which gave me a lot of the knowledge I was lacking. But also some kind of legitimacy given the fact that I hadn’t sort of properly studied that.

Even though at the time there wasn’t really anything like a data science degree, there is now. But in 2012, 2014 there wasn’t any such thing. But that gave me the legitimacy to be able to do that without people questioning too much the fact that I had originally studied many different things.

After that I went back to Paris, worked for a couple of years as a data science consultant doing data science, data engineering, machine learning, deep learning, things like this. And after that the pandemic happened unfortunately, and I had an opportunity to join the people at Our World In Data to do what I do now.

IVAN: And you mentioned you went to Oxford and that’s how you met a ton of the people that were a part of Our World In Data. Max Roser founded OWID ten years ago and he was basically doing all of this on his own. But now it’s a much larger organization than just one person. There’s Y Combinator involvement, you guys were part of a winter batch in 2019. Can you describe the process that ended up with you working at OWID? How did that look? What happened?

EDOUARD: So it was very gradual both for me and the organization as a whole. It started with Max’s project. Max is an economist, historian, I’m not exactly sure how he would define himself. He knows a lot of very different topics. And he started Our World In Data in 2011/2012 very much as a personal project, almost like a blog. He had this idea of using data to produce graphs, charts, maps that would tell people a better story about the world, how it came to be this way, how things are getting better, which things are still not good at all, and how the world is evolving. It started as a side project for him, and then it sort of caught some traction and got a lot of success.

He started getting funding for it from the University, and he was able to hire one person, then more people, then more people and now here we are. As you said, there’s been a lot of growth in the project. We are now a team of about 10 to 15 people split between researchers and people looking at the data like myself, and developers who are working on the site and on the charts.

And for me it was a very gradual involvement. I met Max during my time at Oxford, just as a friend, and we talked about his project very often. But at the time, I couldn’t see any way to get completely involved. And then in March 2020, I started getting a lot of tweets from him about obviously the pandemic, and the fact that the website started publishing some data about it.

At the time it was March 2020, I was living alone under lockdown, so I just told him, Well, if you need any help, I’m available, and it looks like in the next few weeks and months I’m going to have a lot of time on my hands. So he said yes, and I just got involved initially to help them. It was the start of the testing data set, where they wanted to gather all the global data on PCR tests at the time, so I got involved doing that with them. And after a few weeks it went really well, and they just offered me a full-time position which I accepted gladly. And for a couple of months I actually had to juggle between my previous job as a consultant and that job. So, it’s a big old kind of thing that’s only possible when you’re under lockdown, doing two jobs at the same time. Then since summer 2020, I’m glad to only be doing one full-time job with them.

IVAN: It sounds like your title infers that you are responsible basically for all the data, and how it’s accumulated, and how it’s transformed, and how it’s documented and how it’s published. Is that a fair description of your job role?

EDOUARD: Exactly. Yes. Before I joined there was actually no such thing as a data person at OWID, which kind of seemed a little weird. But the team was really split between researchers and developers, and there was really no bridge between them or somebody whose job was clearly to deal with the data. Now, that’s my role, so I’m really in charge of getting the data into Our World In Data into the database, into the charts, and making sure that the data is correct, that it’s up to date, that it’s documented. And possibly gathering more data whenever we think it’s relevant, which has been the case recently for COVID vaccinations. And yes, I’m really the person who is responsible for making sure that the data part of Our World In Data is as good as it should be.

IVAN: What’s your philosophy around where the data is stored, how it’s standardized and where it’s published?

EDOUARD: We try to make it as transparent and public as possible. In the last year this has actually changed a lot in the way we work. Before 2020, we relied a lot on a very classic SQL database, and we still do in some ways. But in the last few months we’ve really started to move over to a system that involves all of our data being available on GitHub, and directly using it as it is on GitHub to feed the data into our charts. We’ve done this for several reasons.

The first one is, we want people to be able to know exactly where the data comes from, what we did to the data, how we changed it and how it came to be the way it is. For other topics it might not sound as sensitive as it is for COVID, but for COVID it’s really been an issue for some other institutions, the fact that they publish something and people just really start questioning, because maybe they don’t like the results they’re seeing, they start questioning the data itself, and how it came to be and where is the code and how did you get those results.

For us, we really wanted to avoid this, and so all of our data and the code behind it is available on GitHub, which also provides us with a versioning system so people can not only get the data and get the code, but also check what they looked like a week ago, a month ago, and make sure that we don’t cheat the numbers or anything like this.
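[Editor’s note: As a minimal sketch of what that versioning makes possible, a file on GitHub can be addressed either at the tip of a branch (the same URL every day, with the newest data) or at one specific commit (a frozen snapshot that can be compared against later versions). Only the raw.githubusercontent.com URL pattern is GitHub’s own. The organization, repository, branch, file path and commit hash below are hypothetical placeholders, not OWID’s actual layout.]

```python
RAW_HOST = "https://raw.githubusercontent.com"


def raw_url(owner, repo, ref, path):
    """Build a raw-file URL for `path` at `ref` (a branch name or a commit hash)."""
    return f"{RAW_HOST}/{owner}/{repo}/{ref}/{path}"


# Hypothetical names, used only for illustration.
OWNER, REPO = "example-org", "example-data"
PATH = "public/data/vaccinations.csv"
COMMIT = "0123456789abcdef0123456789abcdef01234567"  # placeholder hash

latest = raw_url(OWNER, REPO, "master", PATH)   # stable URL, content changes over time
snapshot = raw_url(OWNER, REPO, COMMIT, PATH)   # content frozen at one commit

print(latest)
print(snapshot)
```

Downloading both files and comparing them is one way a reader could check that published numbers were not quietly altered between two dates.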

Beyond this, our philosophy is to make everything available, so none of the things we publish on Our World In Data, whether they’re the articles, the charts, the data, the code, none of it is under classic copyright. We have a CC BY (Creative Commons Attribution) license, which means that people can reuse, readapt, republish anything we do. The only thing we ask for them to do is to credit us by saying that it comes from Our World In Data. But beyond this, it’s very much our goal that people reuse everything we do. We’re not a newspaper. We’re not a media outlet. So, we’re not aiming at just publishing things that people read on a website and that’s it. We want people to share everything we do, enhance it, reuse it and make it as useful as possible for the world.

IVAN: One of the things I really like about the data that you’re publishing is that there is always a source cited. And it’s mostly a government entity that is publishing that data as part of some program, as part of some law, it’s usually open data as well. Are you collecting data that there are other licenses for? Is there ever any pushback at the data that you’re publishing from the original sources, or is that pretty much all squared up and lined up by the fact that people are interested in publishing the data in an open fashion?

EDOUARD: There is rarely any pushback after publishing, because I don’t think we’ve ever published something without the consent or the legal approval of the source. However there can be, not some pushback but some difficulties with getting the data in the first place. Just because some of our researchers have been working on some topics for years and years, they know that some data is available in the world about topic X. And they know that it belongs to a company for example, or that it’s stored in a database somewhere, but that it’s impossible currently to access it. It can be especially the case in things like energy, where some of the big petrol companies in the world have a wealth of data that we know them to have, but that is very difficult to access because it represents things that can be confidential, and things that could be like an economic resource for them so they’re not super used to making it public.

So, we’ve sometimes had discussions with these groups, with these institutions, with these companies, to try to get them to make the data available. Once this is done, they are usually really happy with the result, also because it’s really not our style to change the numbers or to do anything weird with the data.

So we usually just take the data, we make it into a form that we think is much more understandable for the public, or in terms of the data sets we make it into a form that is reusable by people in a machine readable way. The sources we use are either already making the data public, or we get them to do it and then they’re really happy they’ve done it.

IVAN: I love this idea that the data is machine readable and also available on GitHub, and that the URLs are the same, so that you can get the latest vaccination data for the United States, for example, by simply pinging the same URL once a day. There’s nothing that changes on it except the data. The URL doesn’t change. That’s a wonderful asset that you have, and I hope that it continues. I hope that you continue to evolve the data sets and that this continues to be the way you publish data.

EDOUARD: I think it will be the case because we’re actually in an unusual position in that we’re both consumers and producers of data. We consume the data in that, for example, in the last month I’ve collected data on COVID-19 vaccinations from dozens and dozens of websites. I collect that, and I have to write scripts that collect the data as automatically as possible so that I don’t spend every day going there and collecting the data. At the same time what I do with this data is that I clean it, I rearrange it, I make it into data sets that I then publish. And so, because I’m also a consumer of data, I know exactly what people wish I would do and what they wish I wouldn’t do.

So, for example, yes, I’m very aware of the fact that stable URLs are a thing, and whenever I’m going on a website and they actually change the URL every day, it’s a bit of a pain and I have to write a script that finds the new URL instead of just using the same one. Same thing for changes in format, in structure and in the names of columns. So, in the same way, with the data we publish on GitHub, we’ve had to change a few things a few times but most of the time we are very careful not to change the format, not to change the names of columns. Or we make people aware of the change well ahead of time, so they can adapt their scripts and it doesn’t break anything.
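[Editor’s note: A consumer script can guard against the kind of breakage described here by checking the CSV header before using the data. The following is a minimal sketch under stated assumptions: the column names and the sample rows are invented for illustration and are not taken from any OWID file.]

```python
import csv
import io

# Hypothetical column names; a real script would list the ones it depends on.
EXPECTED_COLUMNS = {"location", "date", "total_vaccinations"}


def missing_columns(csv_text, expected):
    """Return the set of expected columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return set(expected) - header


# Made-up sample in which the publisher has renamed one column.
sample = "location,date,vaccinations_total\nFrance,2021-02-01,1500000\n"

missing = missing_columns(sample, EXPECTED_COLUMNS)
if missing:
    print("Schema changed, missing columns:", sorted(missing))
```

Failing loudly on a renamed column is usually preferable to silently producing a chart from the wrong field.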

IVAN: Where do you anticipate that going in the future? Do you anticipate it ever being a queryable API that the data set that you have as an organization becomes so valuable that it’s not just sufficient to have a CSV file in GitHub. Which, don’t get me wrong it’s amazing, but that maybe there’s a use case where there’s an API that also makes this data available?

EDOUARD: That is entirely true, and to be honest and transparent that is clearly on our roadmap. The thing is it’s been a bit of a difficulty to get a stable roadmap. Many times during the day we thought, Hey it looks like this thing is calming down. Maybe we can try to do this long-term project that we talked about a few months ago. And then every time a new thing happens, or we have to create a new data set related to COVID.

But when things do calm down, whenever it happens, creating an API and letting our users query the data on a variable level, or on a country level, or on a year or date level is very much something we want to do. We think that for a lot of users the CSV format is a very good one that can be for media organizations, researchers, students, whoever wants to look at the data and feels more comfortable with a tabular format, with a spreadsheet or something like that. But also, we know that a lot of people who want to use our data would be much more comfortable using an API and just querying what they need, instead of getting a whole database.

Also because obviously with everything that’s happened in the last year for example, what we started as a very small data set for COVID has grown into a CSV file which is 14 megabytes. And 14 megabytes isn’t huge, but if somebody’s building an app that fetches the data every day, that can be a bit of a problem. So definitely for a lot of things that we build whether they are COVID related or not, we do aim to make an API available in the future whenever that’s possible.
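[Editor’s note: To make the trade-off concrete, here is a minimal sketch of what a consumer has to do without an API: load the whole CSV and filter it locally by country and date. A query API would apply the same kind of filter on the server and return only the matching rows. The column names and figures below are made up for illustration.]

```python
import csv
import io

# Made-up rows for illustration only; these are not real figures.
SAMPLE_CSV = """location,date,new_cases
France,2021-02-01,100
France,2021-02-02,120
Germany,2021-02-01,90
Germany,2021-02-02,95
"""


def query(csv_text, location=None, start=None, end=None):
    """Yield rows matching an optional location and an inclusive ISO date range."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if location is not None and row["location"] != location:
            continue
        # ISO dates (YYYY-MM-DD) sort correctly when compared as strings.
        if start is not None and row["date"] < start:
            continue
        if end is not None and row["date"] > end:
            continue
        yield row


rows = list(query(SAMPLE_CSV, location="France", start="2021-02-02"))
print(rows)
```

With a 14-megabyte file, every client repeats this full download and scan, which is the cost an API would remove.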

IVAN: We’ve talked a lot about COVID, the fact that you are tracking COVID data, but you’re tracking a great number of other data sets as well. And, by great number I mean many hundreds. It’s interesting to me that you’re both the publisher and the consumer of data. So, when I look at your website, and I would love to give this URL out, www.ourworldindata.org/charts, that’s basically the result of all of the consumption of your own data and the knowledge from your researchers that’s gone into publishing this information. If you go to that /charts URL it is a very long page of very many data sets as I said earlier.

There’s everything from tetanus shot death rate, to child mortality, to water use and of course for our purposes now, Coronavirus data. Can you give our listeners a sense of the breadth of the data that you have? And sure, Coronavirus is a focus now, but what are the other things that are a part of your focus as an organization?

EDOUARD: It’s really something that, as you said, has grown into a pretty huge database of data, of charts and of articles in the last 10 years. We are interested in pretty much anything that we think is a big problem for the world, and of course that’s a lot of things at the moment. In a broader picture there have been many problems in the past, there are still a lot in the present, and there will be more problems in the future.

And that can cover things like demographic change and economic inequality, and food and agriculture, the environment, climate change of course. And so, we have a great number of charts, and as you said at that URL people can find the chart themselves which is something that a lot of people want to find when they get on our website. They want to find mostly a chart and we know that a lot of people in our audience are very data savvy, they like numbers, they can read charts easily, much more than the average person in the population probably.

So, some people are just there to consume the charts, and so we make them available with the clearest possible title, subtitle, annotation and as up to date as possible so that people can consume them this way. On top of this, we also know that some people come to our site for guidance as to how they should discover or understand a topic. So, some people come to the site, and they want to learn about climate change, and they haven’t really looked into the question too much in the past, and so if you gave them that list of charts, they wouldn’t really know what to look for. So, we don’t just give them charts, we also give them pages where they can just go to the page about climate change and we try to tell them the story about what we think they should know about climate change today, in the past and in the future.

And what are the relevant data points to know, what are the trends that they should know about so that they can discover the story, understand the data and also make their own opinion about the issue. On top of helping them understand the topic, we also try to make it more obvious which possible solutions there are to these issues. We don’t just talk about the big problems, we also try to highlight the possible solutions to these problems.

And of course for some problems like climate change we have some very in-depth articles about the different emissions that are caused by different things in society, or the different types of emissions, different amounts of emissions caused by different kinds of food or different types of protein, so that people can directly link the issue we talk about with potential solutions and potential ways that the world could be different so that those problems get better.

IVAN: You’re tackling really large global problems and that is admirable as an organization. It is scary to think [laughing] about all these things as a human that is inhabiting this planet. I’m glad that your non-profit is attempting this. The Sustainable Development Goals tracker lists a number of major problems, and I’d love for you to talk about how your organization’s mission is aligned with this tracker, and how it’s related to the U.N.

EDOUARD: What we call the SDGs, the Sustainable Development Goals, are a list of goals that were drafted and signed by the United Nations members in 2015, and they were goals for humanity for the span of the next 15 years. So, supposedly by the year 2030, we want to achieve a certain number of goals in different areas. And as with any kind of goal in life, whether it’s big goals like this, or just personal goals with the new year or company goals, we all know that one of the big things about it is first of all defining the goals precisely, not just being very abstract about what you want to do. But getting precise thresholds and numbers about where you want to get. And also tracking your progress. If you spend the next 15 years just wishing that things will get better and then checking after 15 years where you are, most likely you will realize that you failed, because you haven’t taken the proper steps to accelerate the pace of what needed to be accelerated.

So the idea of the SDG tracker is to track those goals and to make sure that we know where humanity is at on those very large goals. So to give you a few examples, there are goals about poverty, about education, about gender equality, about democracy, climate of course, and many things like this. And the idea is that on this website, which is SDG-tracker.org and which people can find easily on www.ourworldindata.org, people can click on any kind of goal like this, and what they get is the definition of the goal as stated by the SDG, by the United Nations.

And then next to it usually a chart, which is a chart taken from that long list of charts that you mentioned, and that lets people check a world map usually of where things are at the moment, everywhere in the world and for the latest year available and to realize whether we are on track or not to reach that goal.

IVAN: How are we doing? There’s only about 9 years left right? You said 2030 right. How are we doing with these goals?

EDOUARD: I think that for some of them we are on track for the target. For some of them it looks like it’s going to be much more difficult. Obviously, some of the difficult ones will be some of the climate ones, because unfortunately the world still isn’t quite doing what should be done to do this better. For some other problems the SDGs were more aligned with what were pre-existing trends. It is also trends that we want to highlight on Our World In Data. For example the fact that as time goes on, many things are actually going better if you take a long enough timeline. Things like 50 years, 100 years, 200 years, the amount of education getting better, and fewer wars, and fewer deaths, and less infant mortality, more gender equality and more free countries with free elections.

Things like this where the sustainable development goals are really trying to find a way to say that this should be getting even better. But there are many ways in which the world is already getting better, and in which fortunately despite some problems obviously in some countries on things like democracy, overall the big picture is that the world is getting better and better, and that some of these goals will thankfully be achieved.

IVAN: That reminds me of a tweet that Max Roser has that he pinned on his profile. I think it’s from the middle of last year. There are three statements and all three can be true at the same time. The world is much better. The world is awful. The world can be much better. I just love the way that’s said, because it kind of encapsulates what you were just describing about the trends and depending on how many years you look at certain trends.

EDOUARD: Exactly, and those three statements are extremely important to us. They kind of summarize really well in a few words the entire philosophy behind our work. If you take each statement individually it can seem a bit extreme and a bit exaggerated. And actually, we think that the people who only believe in one of these statements tend to be people who don’t think in a very balanced way. So, some people think that the world is just getting better whatever happens and that we shouldn’t worry too much about the things that we think are bad. We obviously think that’s not true. Some of the things in the world are still very preoccupying.

However, we also think that some people tend to think that everything is just awful about the world, and those people don’t realize what I just mentioned before, the fact that if you zoom out geographically about the entire world, and if you zoom out in time, you need to realize that there are very long-term trends by which the world is definitely getting better.

And failing to recognize this is also being too caught up in the present moment and too caught up in current affairs and in the news and failing to realize that things can be getting better.

That brings us to the last part, yes, the world can be much better. That means that despite the world being awful and despite the world being a little bit better than before, the world can still be so much better. And for that to happen we need to worry about the big problems and find solutions to that. And that’s also why as I said earlier we don’t just talk about the problems but we also try to talk about the solutions, so that we can not just stop at the present moment but also look forward to the future and try to make the world an even better place in the next few years, in the next decades, in the next centuries.

IVAN: I love that philosophy and that positive attitude and balanced view. I think it’s very important for anything, and I’m so glad that your organization is thinking about it and that you are approaching it the way you are. I wanted to circle back to the idea that you are trying to be as transparent as possible, that you are trying to publish all the data on the web so that other people can use it and you can be consumers of it as well.

I have to commend that. It’s completely consistent with how TEN7 functions, how we believe people should be functioning in the world. I wanted to ask about the user experience of the charts themselves and presenting data in a way that’s not just educational and informative, but beautiful as well. I look at your charts and I think to myself, Wow, these guys have really put a lot of thought and effort into how the data’s going to be displayed. And they’re not using bad Microsoft Excel graphs to publish important things. It looks like they’re using tools that are sophisticated and beautiful. And I wanted to ask about the tools themselves and your focus on making the data look nice. What’s the toolset that you’re using to show the data, to do the charting?

EDOUARD: The tool we use is what we call the Grapher, very simply. It is something that is completely written by our team. It’s also something we make available online so we have special GitHub repos where people can look at the code behind the Grapher, contribute to the Grapher. We actually have somebody who over the last few years contributed occasionally to the GitHub project and ended up joining our team last year, because his contribution was so great, and it is for that that we offered him an opportunity to join the team and it’s been great since. So that Grapher is as you said something that lets us represent the data in mainly three ways for now, but we hope to make many more available in the future, which are line charts, bar charts and maps.

To some people that can seem like a very short list of ways. But actually most of the problems in the world and most of the data in the world can be represented by these three things. Lines will be more useful for anything over time. Bars to compare things or countries together at a given time, and maps obviously to represent geographical differences. We find that for 99% of the things we want to represent and talk about and explain to people these three formats are enough. So, we have built, and our developers have built, this tool that is entirely custom and that lets us upload data represented over time. But also, more importantly, it lets our users customize the output of the graphs.

A lot of what I think people like about our website is the fact that for any kind of data, any kind of topic, they can select a list of countries including most of the time their country because that’s usually what they’re most interested in, compare with other countries or with an entire continent or compare it over time and produce that chart that is really customized to what they needed to know about. And then export it to a format that is very easily shared with other people either privately or on social media or even in a presentation.

We know that there are a lot of teachers using our data, which feels extremely nice and extremely useful, and we know that it’s especially in universities, but also in middle school and high school. Some teachers tell us that they really like the way that our data can be customized so they can represent a problem with just the right level of complexity. For some problems you want very detailed data, but for some other problems you just need the broad strokes so that people get the general picture.

Obviously, I’m not going to lie, it also has its downside, and we’ve seen a few of them with COVID. The custom aspects of the graphs also means that countries or some people can cherry pick the data, and we’ve seen a fair share of governments who use our data and our charts to select just the right list of countries that will make them look very good.

Or, opposition, like the people from the opposition using just the right list of countries so that the government looks really bad. But that’s part of the game in a way, and we want people to do as much as they want with the data. Of course we think people should use it responsibly, and people should use it in a way that is accurate and represents the truth. One of our goals is really to let people explore the data themselves, so that they can not just believe what we say, but also think for themselves and discover trends on their own that maybe we hadn’t previously identified.

IVAN: So trust but verify.

EDOUARD: Exactly. And that’s also the philosophy behind the sources we’re using. We try to be very careful about any new data that we put on the website. We don’t rush into putting any kind of new data set just because it’s new and people talk about it and it looks fancy. We make sure that the data behind it is accurate, that the people who are behind the data can be trusted, that their method is publicly available, so that we can know how they calculated something.

A recent good example of this is excess mortality for COVID-19, which is a very politicized subject, because some people use it one way or another to prove that some restrictions work or that COVID is really bad or that COVID is not that bad. And the fact that we put this data online based on work from other groups meant that we really had to understand how they came to the numbers that we’re showing, and what exactly their calculation methods are. We don’t just put the data out there; we also tell people that they can trust this data. Of the dozens of charts that they see about mortality, this one is the one we think has the best method, and they can trust the data they see.

IVAN: They can download the data themselves and verify it for themselves as well which is wonderful.

EDOUARD: Exactly, and that’s also something that’s really useful for us. Again, with COVID-19, some people inevitably are suspicious of the data that’s out there, rightfully so sometimes. And so we inevitably get people asking questions about what we do and how we come to publish what we publish. The fact that in that context we publish everything on GitHub is actually not just about openness and making things available at stable URLs, it’s much safer for us because anytime somebody says something like, Oh, you’re cheating the numbers. How do you calculate this? This is fake. Well we can just point them to the code, or to the data or to the history of the data. And we can tell them, Well, if you think that this calculation is not right, here’s how it works currently. If you want to suggest ways that we could improve it, we welcome feedback, but in a way that feels much safer for us because we have millions of eyes looking at the data. We have thousands of eyes looking at the code, and so we know that most likely the way we calculate things is very likely to be the right one.

IVAN: Also the data on GitHub gives you an audit trail as well. Not only is it open and you can see the versions, you can see how the changes were made. And you can see the reasons why they were made, and you can see the pull request that someone may have provided to improve the calculation or to change a calculation. And so that builds confidence in the data as well. I think it’s a very smart way of approaching the vast amount of data, and quite honestly, the huge responsibility that your organization has.

EDOUARD: Yeah, and it’s a very positive side effect of something like GitHub. Originally GitHub, like any kind of code versioning system, was not really meant for that. It was mostly meant for versioning the code, being able to track and correct changes, and helping developers work together so that code could be more easily written. It wasn’t really created with transparency and accountability in mind. But we find that, as you said, this is really, really good for accountability and transparency. Anytime somebody says, You’ve changed this data. It didn’t look like this yesterday. We can just point them to the previous commit and tell them, Well, we’re not hiding anything. Look at the data. If you think something has changed it’s because we also made this change in the code, and here is the reason why we made it.

And people, as you said, can track the pull requests, they can track the code, and they can also look at the issues. Especially with the vaccination data lately, we’ve had a ton of discussions on GitHub with other people who suggest changes to the data. All of these discussions are public, and we’ve actually been surprised to see a few people from the ministries of health of some governments getting involved in discussions and offering ways that we can get the data more easily.

IVAN: Wonderful.

EDOUARD: A month ago we had somebody from Saudi Arabia, from the government, just chiming in on a discussion and saying, Hey, do you want some easier way to get the data? They got in touch with us by email after that. And we said, Sure, let’s do that, and they created an API endpoint just for us.

But it’s public because everything we take is public. So, now there’s an API endpoint for anyone, but we use it to automate the data collection. And it felt really nice to have this very natural chain of events: there’s a discussion online, they get involved, and they offer a way to make data available even though they are from a government. You know, people from government are notoriously rarely available online on GitHub to share data. It’s really not the culture usually, and so it felt really nice to have this work this way.

IVAN: Well, congratulations. I think you are doing a wonderful job with the organization. I love the philosophy behind it, and that you’re making the data available in a standardized form. It’s really quite wonderful. And, congrats on the Grapher as well. I had no idea that was a product of your own. It’s just gorgeous and beautiful, and I love the little play button, and you can see how data changes over time. Congrats on that as well.

EDOUARD: Thank you very much. Thank you.

IVAN: Thank you also so much for spending your time with me today. It’s been wonderful talking to you and I hope you’ll come back again soon.

EDOUARD: I will happily do so. Thank you very much for inviting me.

IVAN: You’re welcome. That was Edouard Mathieu, Head of Data at Our World In Data, a non-profit which is about research and data to make progress against the world’s largest problems. You can find them online at www.ourworldindata.org and you can find Ed on Twitter and GitHub. Check out our show page for those details.

You’ve been listening to The TEN7 Podcast. Find us online at ten7.com/podcast. And if you have a second, do send us a message. We love hearing from you. Our email address is [email protected]. Until next time, this is Ivan Stegic. Thanks for listening.

Credits

This is Episode 110 of The TEN7 Podcast. It was recorded on February 5, 2021 and first published on February 17, 2021. Podcast length is xx minutes. Transcription by Roxanne Chumucas. Summary, highlights and editing by Brian Lucas. Music by Lexfunk. Produced by Jonathan Freed.

Please rate our podcast! Doing so helps spread the word about the show. Just pull it up in the Podcasts app and scroll all the way down, hit the stars and you're done! Thank you.


Related Episodes

Coleman Rollins: Creating a Startup to Help the World Breathe Easier

Gavin King: Becoming DJ Aphrodite - Using Technology to Make the World Dance

Subscribe and listen wherever you get your podcasts.