Tech in EdTech

AI, History, and the Fight for Source Truth

Magic EdTech Season 1 Episode 81

AI makes it dangerously easy to stop at one confident answer. For education publishers and learning platforms, that’s a product risk: teachers need sources they can trust, students need research support (not shortcuts), and credibility doesn’t survive black-box answers.

In this episode, Jan van der Crabben of World History Encyclopedia explains how they built the platform on one core idea: history is a web of connections, not isolated timelines. 

Jan also shares how their History AI evolved through teacher feedback, why shorter answers and visible citations matter, how academic journals help reduce hallucinations when proprietary content is limited, and why trustworthy publishing is at risk as AI becomes the default interface to knowledge.

00:00:01.88

Eric Stano

This is Tech in EdTech. I'm Eric Stano. I'm excited for today's conversation, which will mine the intersection of history, culture, and artificial intelligence. I'm joined today by Jan van der Crabben, the CEO and founder of the World History Foundation. Welcome, Jan.


00:00:24.70

Jan van der Crabben

Hi, thank you for having me. It's a pleasure to be here.


00:00:27.48

Eric Stano

Well, I'm so pleased to have this conversation with you and for you taking the time. So why don't we begin by talking about you, if you could tell us a little bit about your journey and what led you from a background in journalism and video games, which themselves seem diverse, to be occupying the same biography. But what led you from the background in these areas to found the World History Foundation and the Encyclopedia?


00:00:54.65

Jan van der Crabben

Yes, sure. So, yes, as you mentioned, I did study journalism and actually war studies, which is also another interesting topic at university. That was my master's. 


00:01:04.63

Eric Stano

Fascinating.


00:01:05.40

Jan van der Crabben

And my master's degree was actually about war in virtual worlds, as in massively multiplayer online games at the time. This is a long time ago, so that was all new at the time, you know. And so it was an interesting master's thesis.


00:01:18.78

Eric Stano

Right, right.


00:01:20.81

Jan van der Crabben

I had also dabbled in sort of video game modifications and so on. During my university years, I was involved in the testing of some games, particularly the Civilization franchise, with the developers of that game, due to that work. And so after my master's thesis, I actually wanted to do a PhD. I had started a PhD that was about, again, online gaming culture and things like that. I wasn't quite sure exactly what it was going to be. I did write some proposal, but I don't remember what it was. Because I didn't finish the PhD, right? So I thought, OK, maybe I should get a job. And I applied for a few jobs in game design specifically. And with that background, I got a job with the Creative Assembly, a part of Sega, who make Total War, a series of historical strategy games. And so, yeah, I worked on that for several years and, well, I had a long career in game design, actually. But that's not the topic of this conversation. But during that career, especially when I started, so that was in like 2006 to 2009, that period, I did a lot of historical research because the games that we made were supposed to be historically accurate. And what I found is that there was essentially something missing on the internet. There was something missing that was free, that was reliable, and that was sort of aimed at the general public. So, of course, we already had Wikipedia at the time, which is fantastic, but not always a hundred percent reliable.


00:03:00.82

Eric Stano

Right.


00:03:04.76

Jan van der Crabben

It's gotten a lot better. Then, you know, we had something like Britannica, for example, which is a fantastic publication, but it's not free.


00:03:06.94

Eric Stano

Certainly.


00:03:13.69

Jan van der Crabben

And then we had academic websites and journals, which were either, you know, not really aimed at the general public, hard to understand, very dry and in depth, you know, or hidden behind extremely expensive paywalls that only institutions could afford, or both, in fact.


00:03:32.64

Eric Stano

Right.


00:03:32.70

Jan van der Crabben

And so it's within this triangle that I situated World History Encyclopedia, which you know was just a hobby project to start with. I wanted to improve my programming, you know, get back into web design, which I did when I was younger. And so, I started just programming this website sort of for fun on my commuter train journey on a little netbook. And I launched it in 2009. Back then, it was called Ancient History Encyclopedia, which, yeah, was just initially a little hobby project.


00:04:10.85

Eric Stano

Very nice, very nice. And I can now see the through line a little bit more. These weren't historical shoot-em-up games that you were working on, but ones that were deeply rooted in history. So I kind of get a sense of your journey now. And you've begun to unpack this a little bit, but I'd love to understand. I know you were trying to, sort of, combat is not the word I'm looking for, but create a space where it was free, it wasn't dry, it was available, and consolidating history for interested users, students, etc. But tell me, go a little bit deeper if you could into the problems you were trying to solve by digitizing history and cultural knowledge.


00:05:05.02

Jan van der Crabben

Yeah, I mean, so I think the primary problem that I was trying to address was access.


00:05:12.30

Eric Stano

Sure.


00:05:12.89

Jan van der Crabben

Like I mentioned, you know, there's this idea of open access, particularly in academic publishing, and that's all about, you know, the content being available for free. But I think that that's only like the first step, really. If something wants to be accessible, it needs to be also easy to read, it needs to be interesting to read, and it needs to be well presented. So that's where my journalistic background came in as well, right? Because I was obviously the first editor and also writer of World History Encyclopedia, even though the writing was not my thing. So the idea was really to say, well, you know, this needs to be a publication that sort of highlights what is interesting about, you know, history, that makes it engaging and not dry and boring and just dates and facts. In fact, I like to always say that the story is the key part of the word ‘history’. And so, that's kind of the motto that guides all the articles that we publish on the website. And then also another thing that I was looking at, particularly at the time, was this idea that history is actually not like a linear timeline or progression as we often look at it when we study it in school. But it's actually a web that spans, you know, both space and time. And, you know, one of the things that really made me think about that was this idea that, well, you know, when we were in school, well, you in the United States probably at first learned about the Louisiana Purchase, right, where the United States bought so much land from France, which is fantastic, obviously, for the United States. And then maybe you learn separately about Napoleon conquering Europe, you know, great stuff as well. Maybe not so much. Who knows? Depends on your perspective.
And then, you learn also in a separate, completely separate sort of window on history, you might learn about the Latin American wars of independence, Simon Bolivar, and all these people behind that. And what isn't really taught is that they are all connected, that, you know, France needed the money to wage Napoleon's wars and therefore sold Louisiana to the United States. And then France conquered Spain, basically decapitated the Spanish empire, and gave all these colonies in Latin America the chance to declare independence. You know, these are all connected. And I think the way it is often presented in school, it's like these are sort of separate windows that we study. And often these connections are missing. And so the whole website was built around this idea that all the articles, all the images, the timeline events, everything is connected behind the scenes as kind of a web, which obviously the internet is really good at through hyperlinks. But we built a custom content management system that essentially represents all of these connections in the database and therefore is able to automatically link between all the content that we have in order to make it really easy without too much human or editorial input to allow people to sort of do a deep dive on history and really discover all of those connections just by clicking on links on the website.
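
The "web of connections" described here can be pictured with a small sketch. This is only an illustration under stated assumptions, not World History Encyclopedia's actual content management system: the `ARTICLES` records, the `find_links` and `build_web` helpers, and the title-matching rule are all hypothetical, since the episode does not describe the real schema or linking logic.

```python
import re

# Hypothetical content records; the real CMS schema is not described in the episode.
ARTICLES = {
    "louisiana-purchase": {
        "title": "Louisiana Purchase",
        "text": "France sold Louisiana to fund the Napoleonic Wars.",
    },
    "napoleonic-wars": {
        "title": "Napoleonic Wars",
        "text": "Napoleon's invasion of Spain weakened the Spanish Empire.",
    },
    "spanish-empire": {
        "title": "Spanish Empire",
        "text": "Its collapse opened the way for independence in Latin America.",
    },
}


def find_links(slug, articles):
    """Return slugs of other articles whose titles appear in this article's text."""
    text = articles[slug]["text"].lower()
    links = []
    for other_slug, other in articles.items():
        if other_slug == slug:
            continue
        # Whole-phrase, case-insensitive match on the other article's title.
        pattern = r"\b" + re.escape(other["title"].lower()) + r"\b"
        if re.search(pattern, text):
            links.append(other_slug)
    return links


def build_web(articles):
    """Build an adjacency map: each article points to the articles it mentions."""
    return {slug: find_links(slug, articles) for slug in articles}


web = build_web(ARTICLES)
for slug, links in web.items():
    print(slug, "->", links)
```

Storing the links as data, rather than hand-writing hyperlinks, is what would let a system add a new article and have existing pages point to it without editorial work, which matches the goal Jan describes.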


00:08:36.03

Eric Stano

Well, I love this notion of history being a web. And as I confessed to you before we began recording, I was a humanities student and publisher. And I also love the emphasis on the story part of history. I also have several jokes that I could make about your presumption of what Americans are learning, historically speaking, but I'll lose half our audience if I make any of those jokes, so I'll just keep moving on.


00:09:06.07

Jan van der Crabben

Yeah.


00:09:05.57

Eric Stano

And I'd like to actually explore for just a moment, why you think it is, you mentioned access being one of the problems you were trying to solve, why do you think it's important for students or for, you know, citizens of the world to have access to history? Why is that something that animates you?


00:09:31.78

Jan van der Crabben

Well, I mean, you know, I think history has sort of a bad name in education, maybe also because of how it has traditionally been taught. You know, a lot of people think, or a lot of students in particular, I think, think that it's kind of boring. You have to learn all these dates, and it's, you know, dry. And it doesn't have to be. But I think that the key, if history is taught well, meaning not as a series of dates, but rather as something where we critically examine the past and try to understand what happened and why it happened and why people made choices, because in the end, history is about people, right? It is among all the scholarly disciplines that we cover in school, it is uniquely positioned to teach critical thinking, to teach students on how to examine sources, to understand, you know, who is presenting us this information that we have from history, what were they trying to achieve? You know, why did they say what they said or why did they write what they wrote, which may or may not be true, you know? And I think particularly in an era right now where we have social media, we have fake news, we have misinformation, and propaganda, and it's getting harder and harder to tell, you know, what is true. There's a trust that is breaking down between sort of, you know, the users of the internet and traditionally trusted institutions like newspapers and so on. I think I read the other day that I think 30% of US people under 30 get their news primarily through Instagram and TikTok, for example, you know?


00:11:08.25

Eric Stano

I think that percentage may, in fact, be small. I think it may be a great deal larger as a percentage of who's actually consuming their primary news from there. Yeah, but I agree with you.


00:11:21.98

Jan van der Crabben

And on the one hand, obviously, it's great that we have access to all these different perspectives. But on the other hand, I think what it means is we need to be so much better at detecting what is true and what isn't. Why is this person saying what they are saying? What are their motivations? Do I think this is true or not? Essentially, understanding the trustworthiness of sources, that is something that we need so much more than ever.


00:11:46.46

Eric Stano

Right.


00:11:48.38

Jan van der Crabben

And history is a discipline that sort of teaches us that.


00:11:50.48

Eric Stano

So much.


00:11:52.38

Jan van der Crabben

And so, you know, to have a functioning democracy in which citizens can participate in an informed manner and a responsible manner requires the understanding of, you know, what can I trust and what can I not trust? And so history, actually, if you want to, you know, put that together, you can say history is actually, or the study of history is actually something that teaches us how to be democratic participants in society, right? And it's therefore actually, yeah, important to preserve and strengthen democracy. And I think, you know, if you think about authoritarian countries, there's a good reason why they tend to very strictly control what the history books can say and cannot say, you know, there's a good reason for that.


00:12:42.33

Eric Stano

That’s right.


00:12:43.80

Jan van der Crabben

And I think there's a big debate about that, and particularly in the United States, which is one of our biggest markets in terms of readers. There's a big debate about that in the United States at the moment as well from both sides of the political spectrum, which I think is a really important debate to have, right?


00:12:59.51

Eric Stano

Right. And in fact, I find myself along those lines, and I love everything that you've just said. I find myself, in fact, adding a nuance when I'm talking about critical thinking, and you know, having this appropriate level of skepticism when you're greeted by any story or any source. I actually find myself saying historical thinking and putting that forward because I think it sort of deepens the import of what people really need to do when they're consuming any information in the world we operate within. And it makes history so relevant to just the practice of moving through the world today. So, I love that.


00:13:39.35

Jan van der Crabben

Yeah, 100%.


00:13:41.81

Eric Stano

So, given the work that you're doing and your organization is doing, tell me a little bit about what it means from your vantage point to digitize something as deeply human as history.


00:13:56.18

Jan van der Crabben

Right. Well, I mean, so first of all, you know, as I mentioned, this was a hobby project that I had started, and I primarily programmed the website and then started writing some of these articles. And I quickly realized a few things. First of all, I realized that there was a ton of work and that there's no way I could get this done. Second, that I didn't enjoy writing those articles very much. Third, that I wasn't very good at it. And fourth, that I really needed some help. So, just after I launched, my grandmother had unfortunately passed away, so I inherited some money.


00:14:31.42

Eric Stano

I’m sorry.


00:14:32.31

Jan van der Crabben

And I actually took some of that money and paid for the first 100 contributions to the encyclopedia. So I contacted various history writers that I had found online who did good writing. And some of these decided to stay on. So, you know, unpaid afterwards. So I think really what it means is for us, it is a passion, right? So, everybody who works at World History Encyclopedia has done this on a volunteer basis to start with. We basically don't employ people outright.  For example, I worked on this for 10 years, sort of in the evenings and for fun, before I quit my job in games and then became a full-time employee. I wasn't the first. So, yeah. It also means that we essentially look at what does the curriculum need. And so we primarily look at the US history curriculum, both for the US history as the subject in US schools, right, as well as the US world history curriculum. But we also look at, you know, other curricula in the world, United Kingdom, Canada, Australia. And now that we're in other languages, we also look at the curricula in France, Spain, and Mexico. And so we take all of this, and we create this editorial plan of what is needed. What do people need to know? What is covered in schools, you know, and essentially, we then buy all these expensive books, the, you know, Oxford University Press or Cambridge University Press books, the ones that you know you might find in a library, but most people wouldn't buy for themselves. And then our writers, who tend to be historians or people who have studied similar disciplines, they take all of this research that we have in academic books and journals and turn it into an article that is not too long, like 2,000 to 3,000 words. It's easy to read. It's suitable for a high school audience, essentially. And, they try to highlight not only what the textbook wants, but they also try to highlight the interesting stories that are behind it and make it relevant to people who read it. 
So, that's essentially sort of the work we do. And then our editors fact-check everything. They make sure that all of the articles are properly referenced, that, you know, the facts in there are correct, that they're internally coherent and coherent with the rest of the encyclopedia as well. And one of the biggest and most important things that we check as well is: is it well written? Is it interesting to read? Or is it just a dry list of the facts that, I don't know, the textbook might want? So that way, we ensure that whatever we publish is not only suitable but also engaging.


00:17:32.98

Eric Stano

Right. And given, I appreciate the walkthrough of what you all have to go through, and it sounds like it's a Herculean, borderline Sisyphean task to move through this huge corpus of content and distill it into something that is consumable and engaging and, you know, something that people are going to want to avail themselves of. Given all those efforts, what sort of guardrails, how do you go about preserving things like nuance and authenticity in a process like that? And, you know, how do you evade the risk of losing some of that?


00:18:16.34

Jan van der Crabben

Well, I mean, we're all human, right? 


00:18:19.10

Eric Stano

Of course.


00:18:24.69

Jan van der Crabben

So there's always a risk of getting things wrong or having some sort of bias or perspective.

But I think what for us, what is really important is that our team is distributed literally all over the world. So we have people in North America and South America, Europe, India, Australia. So we have a very, very global team. And that really helps us to sort of weed out unilateral perspectives or biases that might be more prevalent in some parts of the world than others, because one of the things that is often leveled as a critique, of course, in history is that history often is presented in a very Eurocentric or Western-centric way. So we always try to avoid that. We also, when we look at the sort of history books that our authors and researchers get, we always want to make sure that they are not all like written from a Western perspective, but we always make sure that there is in the mix books that are written by you know historians, archaeologists, researchers who come from the countries that we cover in that sense. And essentially, that's how we try to do, we do our best to create an as impartial as possible, as unbiased as possible perspective that represents the current scholarly consensus. So, of course, sometimes you have, I don't know, exciting new research that's come out, and we might mention that. But, you know, we will then mention that this is recent research or something like that, because in the end, we know that our encyclopedia is used by tens of thousands of schools all around the world. And students sort of need to get the facts right and sort of also represent what the textbook says. So when there's new research that may you know contradict what the textbook says, then we will phrase it in such ways that you know there's a scholarly consensus is that, but recent research suggests that this or that may be true, you know. So we always try to nuance it in that way. 
And when there are historical debates about topics, you know, where historians themselves essentially don't agree like the researchers, then we try to also always represent the debate and the different arguments presented without trying to take a side as much as possible, of course.


00:21:00.48

Eric Stano

Yeah, I would say just, it's interesting, and we could spend hours on this. It sounds like there is sort of, you're animated to combat in some sense that trope about how history is written by the winners.


00:21:16.63

Jan van der Crabben

Absolutely.


00:21:23.64

Eric Stano

And provide, you know, alternative perspectives, surface the debates where there are debates, and evade that sort of, you know, reflexive Western perspective on things. So again, we could spend many hours, I think, talking that through, but I want to shift topics a little bit and move toward, again, the fact that we're already talking about, that this is all going online. You want, you know, everyone to have access. So you're digitizing all of this information, and you're digitizing it in a world that is now informed by AI. And I'm curious as to whether you see any risks in turning cultural heritage, broadly speaking, into AI data. How, in that context, are you balancing access with responsibility?


00:22:15.58

Jan van der Crabben

Yeah, I think there are two sides of this story. One side is, you know, obviously, there are the big AI companies. OpenAI being the most famous one with ChatGPT, Google, Claude, and the others. And then there's how we use AI internally at World History Encyclopedia and sort of the research AI that we've published on our own website. So I think these are two slightly different topics, and I want to address each of them separately. Initially, focusing on the big corporations that publish these AI models, I see that as a very big risk, actually, for cultural heritage, but for human information and knowledge in general, I think we are entering sort of an almost Orwellian time period where, in the end, a handful, and it's literally a handful, if not less, of companies control information retrieval for the world. And, you know, we can say, oh, you know, they are probably all good stewards of information, and they use algorithms that are neutral. Algorithms are never neutral. And we know that, and we can see that when we see how any algorithm change in social media suddenly elevates or suppresses certain voices. Right. And, you know, we've also seen that at the hands of, well, let's look at what Elon Musk is doing with his AI. He's very explicit about it. That's a great example, therefore. He says, you know, Grok is not woke. Grok is not representing woke views and so on. He's even now made his Grokipedia, which is like a rewriting of Wikipedia that is not woke. So, okay, Elon Musk is very vocal about his view and the fact that his AI is actually political in the sense that it represents a certain worldview. Now, the others are not saying this publicly, but I think that, you know, behind the scenes, who knows what's going on? And if we have, like, a few companies that everybody in the world uses to retrieve information, then we have a problem. And I think with AI, we are increasingly seeing that information retrieval with AI is so much easier than Googling something.
So, if we have a handful of companies that are essentially controlling all information, we have a problem because it is so much easier to look for information using AI than to look at, let's say, a Google search and clicking through four, five, six, seven different websites to try to find what you want. And in that process of clicking through the websites, you would have encountered potentially other views. You know you would not necessarily have gotten this one coherent answer. But yeah, it is so much easier to go into ChatGPT, Gemini, whatever you want to use, and just ask a question. It'll give you an answer. It will sound reasonably correct. Yes, it will cite some sources. But you know, if you've got a correct answer, are you going to check those sources? Maybe not. So, it is something I think that is very concerning. And it's something that, you know, us as an encyclopedia, and any encyclopedia, for that matter, I think it's a question like, is maybe AI the encyclopedia of the future? Or is there a space for a separate encyclopedia like ours? I like to believe, obviously, that there is, because I think that some or many people still will want to follow a source that they trust. And I think the trust in AI is not quite there yet, particularly in the education space, which is where we've seen a lot of growth in the last few years, particularly since AI was launched. But yeah, I think there is a certain risk there with AI and how it's being used and how it is literally transforming the human information and knowledge landscape.


00:26:45.56

Eric Stano

Right. And now that we've sort of established a beachhead relative to AI in our conversation here, let's talk about some AI specifics that are related to the World History Encyclopedia and your recently launched History AI Chat. I'm curious what inspired its creation and how it's going to be different than other AI tools.


00:27:12.86

Jan van der Crabben

Sure. Well, I mean, so first of all, AI Chat is what we called it when we launched, which is an interesting thing, there's an interesting story behind it as well. Because what we wanted to create essentially is an AI that would use the World History Encyclopedia data that we have from all of these articles and provide simple answers that are cited with a clear bibliography to clearly indicate, you know, which articles this information comes from, so that people could then click through. And we tested that with many teachers. So we have what we call the Educators' Council. It's over a hundred teachers, professors, and other educators who, you know, are willing to sort of test things for us and give us feedback before we publish it live. And so, the idea was to create this AI chat that would, you know, do exactly what I've just described. And the feedback that we got is that, you know, teachers loved the fact that it was sort of providing the sources from our encyclopedia. But what they didn't like is that the answers were, you know, quite long, similar to what you would get when you type the same question into ChatGPT, for example. And so, for an educational context, teachers wanted shorter answers and more focus on the sources in which the information was found. So the fact that we called it a chat also, I think, put a few people off because they immediately connected it sort of with ChatGPT and the sort of dislike that they have towards it, potentially. Particularly in the education space, for good reasons, I guess. And so, in the end, since we made the initial announcement, it's really become more of a research tool. So we have shortened the answers. You will now get a shorter answer from our AI than you would get in the sort of Google AI summary, if you were to type the same question. And we are making the source links much more prominent.
And also one of the things that we've done is we've partnered with the Open University, which is a big university here in the United Kingdom, who operate the CORE database of open-access journals, as it's called. And that is a global database of open-access journals and research papers. And so our AI is integrated with that so that for any question about history, where maybe World History Encyclopedia hasn't covered it yet or not in enough depth, the AI will actually consult the academic journals in order to then provide the answer, and obviously cite the academic journals as well. So in that sense, it has really now become a tool through which students are able to engage with a corpus of knowledge that is potentially way above their, sort of, academic level in an easy-to-understand way, because, you know, the answers still remain student-focused and appropriate for students. So, just because it finds the information in an academic journal doesn't mean it'll use the same language. So, that's kind of what we've created. And so far, the feedback has been really good. Of course, there are always people who hate AI, which I do get. But yeah, I think seeing it more as a research tool that helps to surface information and make sense of it in a quick way, rather than providing sort of comprehensive answers, is how, you know, we found it has become the most useful for teachers and students.
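
The pattern described in this answer, preferring the encyclopedia, falling back to journals when coverage is thin, and always showing citations, can be sketched roughly as below. Everything here is an assumption made for illustration: the `Source` type, the naive keyword `search`, the `min_hits` threshold, and the sample data are hypothetical, and the sketch does not call the real CORE service or reflect how History AI is actually built.

```python
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    origin: str  # "encyclopedia" or "journal"
    text: str


def _words(text):
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip("?.,").lower() for w in text.split()}


def search(corpus, question):
    """Naive keyword retrieval: keep sources sharing a longer word with the question."""
    keywords = {w for w in _words(question) if len(w) > 4}
    return [s for s in corpus if keywords & _words(s.text)]


def retrieve_with_fallback(question, encyclopedia, journals, min_hits=1):
    """Prefer encyclopedia articles; consult journals only when coverage is thin."""
    hits = search(encyclopedia, question)
    if len(hits) < min_hits:
        hits = hits + search(journals, question)
    return hits


def format_citations(sources):
    """Numbered bibliography so every answer shows where it came from."""
    return [f"[{i}] {s.title} ({s.origin})" for i, s in enumerate(sources, start=1)]


# Hypothetical sample data, not real encyclopedia or journal records.
encyclopedia = [Source("Napoleon Bonaparte", "encyclopedia",
                       "Napoleon waged wars across Europe.")]
journals = [Source("Trade in Medieval Mali", "journal",
                   "Caravans carried salt across the Sahara to Timbuktu.")]

covered = retrieve_with_fallback("Who was Napoleon?", encyclopedia, journals)
uncovered = retrieve_with_fallback("What did caravans carry to Timbuktu?",
                                   encyclopedia, journals)
print(format_citations(covered))
print(format_citations(uncovered))
```

A production system would use a proper search index or embeddings rather than word overlap; the point of the sketch is only the ordering of sources and the fact that citations travel with the answer.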


00:31:00.89

Eric Stano

And it sounds like you've, in some respects, constrained the information, the corpus of content on which you can draw, to the encyclopedia and to CORE. And I can intuit that, you know, what you're trying to do there is diminish hallucinations or misinformation. Do you find that you've made any trade-offs in making that decision?


00:31:31.19

Jan van der Crabben

No, I think. So initially, we started to just use the World History Encyclopedia's content for the AI. And we did find that, you know, on questions where we had plenty of information, the answers were great. However, once we asked it questions where we had little information or none, we ended up in sort of hallucinatory territory. And that's why we then looked at other sources. OK, so we need to find something that is comprehensive enough that it'll be able to answer any question about history, whether we've written about it or not. And so that's how we ended up with the CORE database of open-access journals. Because it contains, I don't know the number, millions of research papers. And that way, we've made sure that all the information that is sourced is highly reliable, as it is, you know, from peer-reviewed journals and our own content. And equally, that it can answer absolutely anything. We also put in checks, of course, that, you know, because we knew that students would be using this, and we knew that students would ask stupid questions or silly questions, right?


00:32:44.36

Eric Stano

Right.


00:32:47.73

Jan van der Crabben

Questions where they are inappropriate as well, potentially.


00:32:51.19

Eric Stano

There are no stupid questions, Jan.


00:32:51.37

Jan van der Crabben

No, no, no, that's true. Not stupid, but inappropriate.


00:32:56.47

Eric Stano

Okay.


00:32:56.86

Jan van der Crabben

Let's call it that. That's probably a much better word.


00:32:58.97

Eric Stano

Fair.


00:32:59.69

Eric Stano

There are those, yes.


00:32:59.96

Jan van der Crabben

And yes, there are those. Not suitable for school use, sort of thing. So, we did put a lot of guardrails in so that it will not answer essay questions, so that it will not, you know, answer inappropriate questions. It will not answer questions that are not about history, things like that, just to ensure that it is a research tool that could be used like live in a classroom, and it'll remain useful, you know, within the bounds that are expected for an educational environment.


00:33:32.77

Eric Stano

Right. Well, that's actually the perfect word to use to catalyze my next question. You've begun to talk about some of the boundaries you've created there in terms of the selection of that corpus of content and the way in which it will field certain questions. Are there other important boundaries, guardrails that you've put in place that you think are important for potential users to know about?


00:34:04.02

Jan van der Crabben

Well, I mean, we have put, what's the opposite of a guardrail? I don't know, openings as well. So, for example, we know that a lot of our users are actually teachers, not just students, but teachers who want to either learn the history themselves or to prepare for a lesson and to look up some facts or whatever. And we've seen as well that, you know, many of the teachers actually want to use our AI tool to generate ideas for lesson plans or classroom activities. And, essentially there, what we've done is that if the AI detects that the question is related to, like, teacher preparation, it will reduce some of those guardrails and allow longer answers because a lesson plan that consists of five sentences is not very useful. So, you know, we've kind of looked at what are the use cases for this tool. And in each use case, we apply different restrictions because they just need it. And so that's one of the things that we've done recently, actually. Haven't communicated too much about it. So, I think in terms of public announcement, this is probably the first time. There's an exclusive announcement for your podcast.
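
The idea of applying different restrictions per use case can be sketched as a small policy table. This is a hedged illustration, not the tool's real logic: the category names, the sentence limits, and the keyword-based `classify` function are all hypothetical, because the episode gives no numbers and does not say how the AI detects a teacher-preparation question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    allowed: bool
    max_sentences: int


# Hypothetical limits; the episode gives no actual numbers.
POLICIES = {
    "student_research": Policy(allowed=True, max_sentences=3),
    "teacher_prep": Policy(allowed=True, max_sentences=30),
    "essay_request": Policy(allowed=False, max_sentences=0),
}


def classify(question):
    """Toy keyword classifier standing in for whatever detection the real tool uses."""
    q = question.lower()
    if "lesson plan" in q or "classroom activity" in q:
        return "teacher_prep"
    if "write an essay" in q or "write my essay" in q:
        return "essay_request"
    return "student_research"


def policy_for(question):
    """Look up the restrictions that apply to this question's use case."""
    return POLICIES[classify(question)]


print(policy_for("Give me a lesson plan on the Louisiana Purchase"))
print(policy_for("Write my essay about Napoleon"))
print(policy_for("Why did France sell Louisiana?"))
```

Keeping the limits in one table, separate from the detection step, means a new use case can be added or a limit adjusted after teacher feedback without touching the rest of the pipeline.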


00:35:17.94

Eric Stano

You heard it here first, folks.


00:35:24.46

Jan van der Crabben

Yes! Exactly! So yeah, the sort of lesson plan idea generation is something we recently added in there. And that is a lot freer in terms of the interaction. And then the other thing that was really important to us as well is that there's context, so that the AI will remember the context of a conversation, which, you know, is, for example, not the case in Google search, where each time it's a new search and you get one answer. So we wanted it to feel a lot more like a conversation that you would have in ChatGPT or Gemini, where if you use pronouns in your follow-up question, the AI will know who you're referring to, you know, that sort of thing.
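
One common way to get this kind of conversational memory is to resend the earlier turns along with each new question. The sketch below shows that pattern in Python; it is an assumption about how such a feature could work, not a description of the actual tool, and the `Conversation` class and its prompt format are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Keeps earlier turns so a follow-up can be read in context."""
    turns: list = field(default_factory=list)  # (role, text) pairs

    def build_prompt(self, question: str) -> str:
        """Prepend the whole history to the new question."""
        lines = [f"{role}: {text}" for role, text in self.turns]
        lines.append(f"user: {question}")
        return "\n".join(lines)

    def record(self, question: str, answer: str) -> None:
        """Store a completed exchange for later follow-ups."""
        self.turns.append(("user", question))
        self.turns.append(("assistant", answer))


chat = Conversation()
chat.record("Who was Hatshepsut?", "A pharaoh of Egypt's 18th Dynasty.")
prompt = chat.build_prompt("When did she rule?")
```

Because the prompt for "When did she rule?" still contains the earlier exchange, a model receiving it has what it needs to resolve "she".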


00:36:03.73

Eric Stano

Interesting. And yeah, I could send us down a rabbit hole relative to how the AI is able to suss out whether somebody is beginning to establish a lesson plan, and you know, engaging in that sort of exploration. But instead, I will take us down a different path because you've been talking a bit about teachers and, you know, your board of advisors. I forget the exact term you used. But you're also now beginning to talk about how teachers are using the chat. Say a little bit more, if you could, about the collaboration with teachers and the design with teachers involved in that effort. How did you find that the collaboration with teachers really shaped your final product?


00:36:54.97

Jan van der Crabben

Well, massively. It's due to the feedback that we received from teachers both before we launched and after we launched that we really were able to fine-tune the AI research assistant in order to provide exactly what is needed. Of course, a lot of us at World History Encyclopedia, not me, but some others have teaching experience and have, you know, taught in the classroom, either in schools or at university. But, you know, this is evolving all the time. And especially now with AI, we know for sure that education is changing dramatically, and it's all over the news as well, right? So we thought, okay, let's make sure that when we do this, we do it right and we provide teachers with what they need. And so before we even started the development, we assembled this educators council. So, you know, that was quite straightforward. We have a large newsletter where people have subscribed, and when people subscribe, they can choose what sort of category they fall into. And so we know who our teachers are. So we emailed the teachers, and we said, hey, you know, we're doing this thing. We're building this council. Do you want to join? And as I said, I think around 150 or so joined. And then, we basically provided them with each version that we built and sent out, you know, a link, and said, hey, this is what it can do. You could try these things. Here's a survey afterwards. Could you please fill it in? And that very strongly informed especially these guardrails that we then put in because, you know, teachers knew what kind of questions the students would ask. That's how we then found certain weaknesses that we had to polish out before we went live. I think the whole process before going live with it was over half a year. So we had it sort of in a testing phase, in a private testing phase, for almost six months before we actually put it out on the Internet because we wanted to make sure that it's not like a flop, that it is actually working well and not, you know, actually making teachers' lives harder by giving students more opportunities to cheat or something like that.


00:39:16.89

Eric Stano

Well, that's actually, in fact, that leads precisely to my next question. Shifting from teachers to students, you know, the folks that teachers and you and I are ultimately really serving. What specific product choices in the World History Encyclopedia and the chat encourage questioning and exploration on the part of students, and maybe even discourage the sort of "write my homework," quote unquote, behavior?


00:39:53.08

Jan van der Crabben

Well, I mean, so the sort of exploration and questioning is in the AI chat primarily done in two ways. So first of all, when the AI research assistant chat, we need to find a better name for this, gives an answer, then just like in our articles, we interlink automatically to the articles directly in the answer. Then underneath is basically a list of the citations of which articles were cited, both on our website and in external academic journals. And then underneath it also, this is AI-generated, it will provide a list of like three to five further questions that basically the user can just click on and it'll ask that question. And then that way, they can sort of go down that question rabbit hole as well. So that's essentially how we want to ensure that inside this AI tool we have, you know, something that essentially creates a dialogue, a learning dialogue that is more than just give me the answer, but it's like, here's a really short answer. You can read more about it here, or you can ask me another question. Because once a student starts engaging, it's no longer a passive exercise. Because if you want to ask a follow-up question, you need to think about it, right? You need to actually absorb the information that has been given to you. And it's not just regurgitating after all. So yeah, that's how we've made sure that whatever the students do with this AI tool, they are encouraged to continue asking questions. Of course, you know, if you want to know the dates that something happened, the AI is going to give you the answer. You're probably not going to need to ask a follow-up question. But then the same thing happens if you type it into Google. Google is going to give you the answer, right? So yeah.
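
The answer layout described here (a short answer, then the citations, then three to five suggested follow-up questions) can be pictured as a small data structure. The Python sketch below is illustrative only: the class and field names are hypothetical, the sample answer is placeholder text, and the inline links to articles are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    title: str
    source: str  # assumed labels: "encyclopedia" or "journal"


@dataclass
class AssistantAnswer:
    text: str
    citations: list
    follow_ups: list

    def __post_init__(self) -> None:
        # The conversation mentions "three to five" suggested questions.
        if not 3 <= len(self.follow_ups) <= 5:
            raise ValueError("expected three to five follow-up questions")


def render(answer: AssistantAnswer) -> str:
    """Lay out the answer, then its sources, then the clickable questions."""
    parts = [answer.text, "", "Sources:"]
    parts += [f"- {c.title} ({c.source})" for c in answer.citations]
    parts += ["", "Ask next:"]
    parts += [f"{i}. {q}" for i, q in enumerate(answer.follow_ups, start=1)]
    return "\n".join(parts)


example = AssistantAnswer(
    text="A short placeholder answer.",
    citations=[Citation("Example Article", "encyclopedia")],
    follow_ups=["Who was involved?", "What happened next?", "Why did it matter?"],
)
print(render(example))
```

Keeping the sources and the follow-up questions as separate fields, rather than burying them in the answer text, is what lets the interface show them consistently under every answer.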


00:41:49.53

Eric Stano

Right, of course. All right, well, let's shift topics a little bit, although still staying, you know, in the realm of student consumption. I want to talk a little bit about ethics, privacy, and academic integrity specifically. So when topics are contested, how does your AI present multiple perspectives without a false balance, and how are editorial biases audited? We talked a little bit about that, but I'd love to go a little bit deeper if we could.


00:42:25.05

Jan van der Crabben

Sure. So I mean, in terms of how the AI presents historical debates, unfortunately, AI being AI, it may give a different answer each time you ask it. And it may be more balanced one time, but then less balanced another time.


00:42:36.38

Eric Stano

Right. An unreliable narrator, as it were. Right.


00:42:43.26

Jan van der Crabben

Yes, exactly. It's not always the same, right? So what we've obviously done is that the sources that it consults will, you know, at least be as balanced as possible, and in aggregate should represent the scholarly consensus as well as new research. So the AI tends to provide answers that include both the scholarly consensus as well as any new research data that has come out in academic papers. It doesn't, of course, always perform perfectly because it is, after all, an AI. I always like to say AI is a little bit like an intern that you give a job to. You're still going to have to check what they do afterwards, you know. Don't take it at face value. It's like your research assistant. It's not doing your research for you.


00:43:34.94

Eric Stano

Right, precisely, precisely.


00:43:37.40

Jan van der Crabben

So I think in that sense, that's there. And obviously, in the way we've instructed the AI, we have told it to, you know, try to follow the scholarly consensus as closely as possible while also highlighting debates and new research. Now, it's AI. It sort of still does a little bit what it wants, but it tends to follow those instructions. I think in terms of the perspectives, that's one of the reasons why we chose the Open University's CORE index of open access journals, because that database is a truly global database of open access journals.


00:44:35.35

Eric Stano

Right. Right. Mm-hmm. Mm-hmm.


00:44:41.29

Jan van der Crabben

So you will find journals from all over the world in there, again, trying to avoid this overly Western bias. Of course, it is true that there are more journals in the West that are published on a regular basis than there might be, say, in Asia or in the Southern Hemisphere.


00:45:01.67

Eric Stano

Okay.


00:45:01.85

Jan van der Crabben

But these global perspectives are still part of that corpus of information. And so that way, we hope that it generally will represent a more global view on things than, say, a very Eurocentric or American-centric view. And in terms of the editorial biases, so obviously, you know, this is a database of academic journals. So this is not something we control. But when it comes to our own website and the content we use there, which obviously the AI does treat with some priority if it has relevant information on our website, our editorial team, well, as I said, everything is checked by at least one, if not two, editors. So whenever there are questions, we will consult with our global team. And generally what we find is that, you know, biases or sort of entrenched perspectives are quickly rooted out by just asking somebody who has a different, you know, background on this topic.


00:46:13.26

Eric Stano

Right. Just a different perch in the world, right?


00:46:13.71

Jan van der Crabben

Absolutely, absolutely. And one thing we find as well, so we do translate all our content into, I think, now it's 45 languages.


00:46:25.37

Eric Stano

Fabulous.


00:46:29.21

Jan van der Crabben

So not everything is translated into 45 languages at the same time, of course. But we have this huge group of translators who are mostly volunteers who translate into their respective languages. And many of them are professional translators who are a bit sick of the boring translations they have to do for their day job. And so, in the evening, they want to translate stuff they are into or interested in. Some people are also historians from other countries, and things like that. So what we found actually, with this translation initiative that we started about five years ago, is that the translators might sometimes pick up on something and be like, oh, you know, the way this is phrased in English, if this was read in, I don't know, say Latin America, it would be understood as this or that. And so, sometimes, translators actually get back to the English-language editorial team and say, you know, maybe this could be rephrased a little bit to be a little bit more, you know, inclusive or to be more acceptable in, I don't know, in my country, you know?


00:47:36.47

Eric Stano

Interesting. Right, right.


00:47:37.94

Jan van der Crabben

And so we found that, you know, even though, of course, we do our best to make sure that the English original articles, because we always publish in English first, are as unbiased as possible, there have been cases where certain phrasings or statements were seen as, you know, potentially offensive or potentially controversial in certain other cultures. And so we've then made adjustments accordingly.


00:48:10.10

Eric Stano

No, I appreciate the fact that you have that number and diversity of voices contributing to the overall editorial practice. Even the translators in different regions of the world helping shape the content, I think, is helpful in ensuring that biases are muted, if not excised entirely. If I could ask, how do frameworks like GDPR influence what's possible and responsible in AI design for you and for education?


00:48:47.19

Jan van der Crabben

Yeah, of course. I mean, GDPR is, of course, something that has affected every website. At least if you live in Europe, you get cookie pop-ups everywhere, which supposedly are going to go away in the next few years because the law is changing, but that's a separate topic. So I think that the strictest laws in terms of child protection are actually in the United States, with COPPA. And I think there's a separate law in California as well. So what we've done is we've sort of tried to establish some sort of minimum boundaries. So we do not target our website at children. So we follow COPPA in the US. That means anybody under 13. So, you know, as soon as we start collecting data, meaning when a user registers, particularly for the United States, we ask about the age. We collect the minimum data that we need, and we collect it as anonymously as we can. So, for example, we do track every question that is asked on the chat for analytics purposes, but we completely separate that data from who has asked it. So we can't put the two together. And they don't come in sequence. It's just, like, every question that was asked, randomly put together. And, you know, we just do this so that we can analyze, OK, so what is it that people are actually asking about, without being able to tell who did the asking. And we do similar things on the website. So, you know, we do track a lot of information, for example, about what people are reading in order to be able to provide people with recommendations on what to read next, for example. But in that case, we actually store most of that data on the user's computer so that we don't actually have the data. It's the browser that will then do the calculations and send us back the information, so that essentially we are trying to separate the user's data from our servers as much as we can, so that our servers only get the data that they need in order to then process, say, the recommendation engine or something like that.
So, you know, obviously, as soon as somebody registers, there's always personal data that we have to collect, like email address, name, et cetera. But we try to keep it really to a minimum. And we always follow the privacy laws, the strictest ones, as much as we can. So, yeah.
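
As a rough illustration of logging chat questions without linking them to a person or to each other, here is a minimal Python sketch. It is an assumption-laden example, not the organization's actual analytics code: the `QuestionLog` class is hypothetical, it never receives a user identifier, and it stores each question at a random position so the stored order does not reveal which questions were asked together.

```python
import random
from typing import Optional


class QuestionLog:
    """Stores question text only: no user ID, no timestamp, no ordering."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self._rng = rng or random.Random()
        self._questions = []

    def record(self, question: str) -> None:
        """Insert at a random position so stored order carries no sequence."""
        position = self._rng.randrange(len(self._questions) + 1)
        self._questions.insert(position, question)

    def questions(self) -> list:
        """Return a copy of the stored questions for analysis."""
        return list(self._questions)


log = QuestionLog()
for q in ["Who built the pyramids?", "When did Rome fall?", "What was the Silk Road?"]:
    log.record(q)
```

A design like this only separates identity from the stored record; question text that itself contains personal details would need separate handling.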


00:51:28.73

Eric Stano

Great. Well, let's actually now step back again from World History Foundation and the encyclopedia specifically. I would love your reflections just from, in a broader context of how you see AI transforming how future generations are going to engage with art, with history, and with culture. How do you see AI impacting that?


00:51:56.41

Jan van der Crabben

Yeah, I think that's the million dollar question, isn't it?


00:51:58.71

Eric Stano

Sure, certainly.


00:52:00.82

Jan van der Crabben

I can see a few possibilities, a few possible roads here. So, I mean, there's… one of them is kind of this dark dystopian future that, you know, I like to think in a way that, you know, if you think about cyberpunk, we're kind of living it now with a few mega corporations essentially controlling the world's flow of information and, you know, knowledge transfer and all of those things, what we just talked about before. And of course, you know, there are certain dystopian scenarios where people will essentially get all their information in aggregation, meaning they will no longer consult individual publications.


00:52:27.22

Eric Stano

Yeah, yeah.


00:52:40.78

Jan van der Crabben

They might just consult AI in a similar way, you know, it's this unbundling, like we used to consume magazines in print, you know, front cover to back, you know. And we used to consume music albums in a package. And then, you know, with the internet, articles got disaggregated. You consume one article from this publication, one article from that publication, and you don't necessarily read a publication anymore, but this sort of mix of things, just as, you know, iTunes and other music players later have disaggregated the consumption of music so that you shuffle artists and albums all jumbled together. So in a way, we could say, OK, maybe AI is disaggregating text, so that instead of reading individual articles, we're only reading fragments of articles that then get rejumbled together by AI, which is kind of what's happening when you're asking a question. It will take phrases or even fragments of articles and then put them together, and you get your answer. So, is this going to be the way that people will consume textual information and also other information in the future? Because AI is getting increasingly good at producing audio and video nowadays, of course. So yeah, maybe. But at the same time, we're seeing as well that people seem to value human-made content a lot more than they used to. And there's a growing sort of consciousness about the importance of human-grade content. We see it a lot now on social media, where a lot of AI slop is sort of entering the social media feeds. Particularly the younger generation, so, like, Gen Z and Gen Alpha, seem to have a somewhat allergic reaction to AI slop. So I really have the hope that more and more people will be wanting to see, well, human-made content, and they will be seeking it out because, yeah, more and more content on the internet is being created by AI in all media, text, audio, I mean, you know, podcasts as well. Like our conversation here now has AI competition, right? So video, all of it.


00:55:12.10

Eric Stano

Of course.


00:55:15.03

Jan van der Crabben

So I think that there will be a place for both, in a way. And there's a reason why certain content will be made by AI, but there will always be other content that will be made by humans. So I hope that this is going to be the case and that the sort of cyberpunk dystopian future is not where we're headed, but I'm not sure.


00:55:36.89

Eric Stano

Right. No, and actually, your, you know, analogies relative to consuming a magazine in its entirety or, you know, purchasing an album in its entirety, so you get that, you know, holistic gestalt of something: for some reason, I immediately flashed to this idea that we're now living in a buffet world, rather than one in which you are consuming a sustained, thoughtful meal that has been constructed deliberately for you.


00:56:09.32

Jan van der Crabben

Yeah, yeah, yeah. That's a good analogy.


00:56:09.69

Eric Stano

Yeah, it is actually, and I'm just thinking about how my more culinary-inclined friends view buffets. That is a truly dystopian idea, the idea that we're going to be living in a buffet model. But I appreciate that reflection.


00:56:29.02

Jan van der Crabben

Yeah, I think you can take that even further, actually, because, yes, it's like a buffet, but a buffet is still like real food, right?


00:56:29.08

Eric Stano

Right. Right.


00:56:36.26

Jan van der Crabben

The way I see AI, it's like the ultra-processed food of information, you know? Like when you look at a lot of this sort of cheap consumer food from the supermarkets, there isn't much like real food in there anymore. It's ultra-processed.


00:56:51.22

Eric Stano

Right.


00:56:51.82

Jan van der Crabben

And I feel like AI is sort of the ultra-processed medium, you know?


00:56:56.73

Eric Stano

Absolutely. And I'm sure we could torture the analogy further and connect it to the obesity epidemic in the world, et cetera.


00:57:04.03

Jan van der Crabben

Oh, it's a good one, yeah.


00:57:06.01

Eric Stano

But rather than do that, what I'd like to do is actually conclude by asking you, you know, if you could leave our listeners with one call to action. You know, we've covered a landscape of issues: the need for people to be skeptical, the different ways in which you've set up guardrails and tried to encourage the right sort of critical behavior on the part of your users. We've really covered a lot relative to history and history within an AI context, but what's one call to action, the single most important step you think, from your particular approach in the world, that educators and innovators can take to really preserve truth and curiosity in the age of AI and really bring to bear that historical thinking that we were talking about at the beginning?


00:58:05.99

Jan van der Crabben

I think for me, the most important call to action for everybody, really, whether it's educators, you know, parents, general public, politicians, is to recognize that we're at an inflection point right now in the history of the internet, in the history of publishing as well. Publishing is in trouble, in big trouble. And that doesn't just include, you know, encyclopedias like ours. It includes absolutely everybody. Particularly now with AI, we've been seeing over the last few years, since the launch of ChatGPT and particularly since the launch of Google's AI Answers, that more and more people decide to quickly look up information and just rely on the answer that AI gives them instead of seeking out sources that they trust, that they have built some sort of relationship with, whether that is, you know, that they consult it frequently or that they, I don't know, subscribe if it's a newspaper or something like that, you know, or a magazine. Basically, the more we collectively as humans rely on AI to provide us with the information that we want to get quickly and easily, the less money publishers, be they nonprofits like us, or be they, you know, I don't know, the New York Times or book publishers or whoever, the less money publishers will make, the less viable publishing is as a sort of a business model, as a career, as a writer or researcher. And the harder we will find it in the future to create, publish, and disseminate trustworthy information. Because if basically AI gobbles up everything, if it gobbles up the entire consumption of information, that's the sort of dystopian future I was talking about, then why would anybody publish any publication anymore?
So, I think if we want to preserve a future of, you know, trustworthy information, where we are not just fed the information by five companies in the world who may or may not have certain agendas, if we want to, you know, remain an informed citizenry, an informed public, we need to go directly to the sources that we think are trustworthy. We need to support them and not just rely on AI. Yes, AI is very easy. I do it too. I want to look up a little quick fact. AI gives me a quick answer. I'm satisfied with it. But I think as soon as we want to go a little deeper, we need to remember that AI is just scratching the surface. And that's something the next generations, the younger generations, need to learn, whether through experience or by what the teachers tell them or what they otherwise get imparted in school or from their parents. I think the reliance on human-made, fact-checked information is something that we mustn't lose.


01:01:18.10

Eric Stano

I 100% agree. As I've confessed, I was not only a humanities student, but a humanities publisher. So everything you've just said resonates with me. And I look forward to a future when we can enjoy a thoughtfully, you know, cooked meal rather than just be subject to the whims of a buffet. But I really appreciate all of your reflections. Again, it's an area that interests me personally. So I want to thank you, Jan, for spending some time with me today. And I want to thank our listeners for tuning in. This has been Tech in EdTech. I'm Eric Stano. Thank you for listening.


01:02:04.44

Jan van der Crabben

Thank you.