From Research to Search

The evolution of how we search online

Every time you search the web – be it for the latest news on your country’s economy, for how the weather will be tomorrow, or for what happened during the Enlightenment – you're taking part in one of the greatest scientific revolutions of our time: search engines.

1989 was not only a year of geopolitical revolutions, it was also the year of the technological revolution called the World Wide Web. British scientist Tim Berners-Lee invented the World Wide Web while working at CERN, the European Organization for Nuclear Research. His goal: Enabling scientists from around the world to share their information in a structured and interoperable manner. This advancement of the internet redefined an old question: With an exponentially increasing amount of information online, how do people find what they need?

Starting up the first engine

“Consider a future device for individual use, which is a sort of mechanized private file and library. (…) It is an enlarged intimate supplement to his memory.” Shortly after the Second World War, the scientist Vannevar Bush was already introducing the idea of a universal body of knowledge for all mankind as well as a fast and virtually limitless storage and retrieval system in his famous essay “As We May Think”.

Fast forward 45 years to 1990. Enter Archie, the first search engine – created one year before the world’s first website went online. Archie worked as a database of the filenames stored on public FTP servers, which it matched against users’ search queries.

The 1990s then became the Wild West of internet search, with an explosion of different search engines, all of them trying out different angles: Infoseek (1994) introduced the advertisement model, using a cost-per-thousand-impressions metric. Yahoo (1994) added manually written descriptions for each web address. AltaVista (1995) was the first to allow natural language queries. Backrub (1996) was the first engine to analyse backlinks pointing to a given website, and would turn out to become one of history’s most important tech stories: Seeing as no one wanted to buy their PageRank technology, Backrub’s two Ph.D. student founders Larry Page and Sergey Brin launched their own company in 1998 and named it Google.  

The first Russian search engine, Yandex, was originally a program that searched through the Russian translation of the Bible, before switching to internet search in 1997. MSN Search was created shortly after, in 1998. Now known as Bing, it’s the second most popular search engine in the world (behind Google). The new millennium started with what would become the Chinese search giant Baidu in 2000 and also introduced new alternatives like DuckDuckGo (2008), focusing only on privacy, as well as the green search engine Ecosia (2009).

How does a search engine work?

Even though search makes up 29% of the world’s internet traffic, we seldom know how search engines really work. Like a car engine, a search engine consists of different parts, the most important ones being: a web crawler (also commonly referred to as a search engine bot or spider), a data structure called an index, and a set of search and ranking algorithms.

Web crawlers browse the web automatically, making copies of every webpage they visit and adding its web address or URL (Uniform Resource Locator) to a data structure called an index. They then repeat this process with every link they find on the pages they visit. The crawlers perform a kind of never-ending “rinse and repeat”, which rapidly increases the size of the index.
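To make the crawling loop more tangible, here is a minimal sketch in Python. It is not how any real crawler is built: the tiny in-memory TOY_WEB dictionary, its example.org addresses and the crawl function are all made up for illustration, and nothing is fetched from the actual internet.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to (page text, outgoing links).
TOY_WEB = {
    "https://example.org/": (
        "Welcome to the library",
        ["https://example.org/a", "https://example.org/b"],
    ),
    "https://example.org/a": (
        "Books about search engines",
        ["https://example.org/b"],
    ),
    "https://example.org/b": (
        "History of the web",
        ["https://example.org/"],
    ),
}


def crawl(seed, web):
    """Visit every page reachable from `seed` and return {url: copied text}."""
    index = {}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url in index or url not in web:
            continue  # already copied, or the page does not exist
        text, links = web[url]
        index[url] = text    # keep a copy of the page under its URL
        queue.extend(links)  # "rinse and repeat" with every link found
    return index


crawled = crawl("https://example.org/", TOY_WEB)
print(sorted(crawled))
```

The check for pages that were already visited is what keeps the loop from running in circles when pages link back to each other.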

The search index then acts as a huge library of all the information the web crawlers have collected. In addition to the web address, it also stores relevant key signals like keywords (what topics does the web page focus on?), the freshness of the page (when was the page last updated?), the content type (what type of content is included on the page?), as well as the previous user engagement (how many people interacted with the page and in which way?).
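As a rough illustration, an index entry carrying these signals could look like the sketch below. The PageRecord class, its field names and the sample pages are assumptions made for this example only, and real indexes are far more elaborate. The second part builds a simple keyword lookup (often called an inverted index) so that all pages for a given keyword can be found quickly.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class PageRecord:
    """One illustrative index entry with the signals described above."""
    url: str
    keywords: set
    last_updated: date  # freshness
    content_type: str   # e.g. "article" or "video"
    engagement: int     # e.g. number of past interactions (made-up metric)


def build_inverted_index(records):
    """Map every keyword to the set of URLs whose record contains it."""
    inverted = defaultdict(set)
    for record in records:
        for keyword in record.keywords:
            inverted[keyword].add(record.url)
    return inverted


records = [
    PageRecord("https://example.org/a", {"search", "engine"}, date(2021, 5, 1), "article", 120),
    PageRecord("https://example.org/b", {"web", "history"}, date(2020, 1, 15), "article", 40),
    PageRecord("https://example.org/c", {"search", "history"}, date(2021, 6, 10), "video", 75),
]
inverted = build_inverted_index(records)
print(sorted(inverted["search"]))
```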

If we think of the search index as a library, then the set of search and ranking algorithms is the all-knowing librarian who can tell you exactly where to find the information you’re looking for. Whenever a user types in a keyword, the algorithms sort through the index in a fraction of a second, collecting the matching results and ordering them according to how well they fit. It’s the equivalent of the librarian who presents you with a pile of books, nicely ranked according to how useful the information might be for you. To deliver the best result, the set of search and ranking algorithms looks at many factors such as the words being used, the quality of the web pages and the sources as well as the user’s location and settings.
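The librarian’s job can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the ranking formula of any real search engine: the rank function, the made-up quality value per page and the 0.5 weighting are all assumptions chosen for illustration.

```python
def rank(query, pages):
    """Return the URLs of matching pages, best match first."""
    terms = set(query.lower().split())
    scored = []
    for url, page in pages.items():
        matches = len(terms & page["keywords"])
        if matches == 0:
            continue  # the page shares no word with the query
        # Illustrative score: keyword matches plus a small quality bonus.
        score = matches + 0.5 * page["quality"]
        scored.append((score, url))
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in scored]


# Hypothetical index entries with keywords and a quality value between 0 and 1.
PAGES = {
    "https://example.org/a": {"keywords": {"search", "engine"}, "quality": 0.9},
    "https://example.org/b": {"keywords": {"web", "history"}, "quality": 0.8},
    "https://example.org/c": {"keywords": {"search", "history"}, "quality": 0.4},
}

results = rank("search engine history", PAGES)
print(results)
```

Pages a and c both match two query words, but a comes first because of its higher quality value. Page b matches only one word and ends up last.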

Users receive different results when using different search engines for three reasons. The first is the index used. The second is the difference in search and ranking algorithms. The third is personalisation. Personalisation algorithms are like librarians who not only know all the books in the library like the back of their hands, but also remember past books you’ve been interested in and adapt your book pile accordingly.  
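Personalisation can be pictured as a second pass over results that have already been ranked. The sketch below is only an assumption about how such a re-ranking might look: the personalise function, the topic labels and the boost value are hypothetical and not taken from any particular search engine.

```python
def personalise(results, page_topics, history, boost=1.0):
    """Re-rank (url, score) pairs, boosting pages that match past interests."""
    reranked = []
    for url, score in results:
        overlap = len(page_topics.get(url, set()) & history)
        reranked.append((url, score + boost * overlap))
    # Highest score first; ties are broken alphabetically by URL.
    reranked.sort(key=lambda item: (-item[1], item[0]))
    return reranked


# Hypothetical base ranking that is identical for every user.
base = [("https://example.org/news", 2.0), ("https://example.org/recipes", 1.5)]
topics = {
    "https://example.org/news": {"politics"},
    "https://example.org/recipes": {"cooking", "baking"},
}

for_cook = personalise(base, topics, history={"cooking", "baking"})
for_newcomer = personalise(base, topics, history=set())
print(for_cook[0][0])
print(for_newcomer[0][0])
```

The same query thus yields a different top result for a user with a history of cooking searches than for someone without any history.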

Wolves in sheep’s clothing:
Trading privacy for a great user experience

Search engines are rated on their user experiences: If you can search easily, receive your results quickly, and get the information you need, you’re likely satisfied with the search engine and will probably use it again, and again, and again.

But that creates a huge problem. As you’re so focused on the instant user experience, you’re losing sight of what’s happening behind the scenes: Your privacy is being destroyed and your data is being sold to third parties, all for marketing and advertising purposes. Companies like Google try to distract from this pain point by highlighting the fact that they don’t sell better placements in search results, but the main problem – your privacy – remains unsolved. You wouldn’t want the all-knowing librarian who remembers all of your personal book selections talking about your preferences to people who offer him money for information, would you?

Redefining how we seek information

The benchmark for great search can no longer just be the user experience. Search engines need to protect your privacy as well. Not only the definition, but also the general approach to searching for information on the web ought to be re-evaluated – giving way to even better technology.  

Both the World Wide Web and online search were invented by scientists trying to improve the way information was stored, processed, and accessed. Xayn does just that: Using the latest AI research – such as edge AI, federated learning, and homomorphic encryption – we developed a new kind of search experience.

Our search engine offers a great user experience with personalisation while protecting your privacy by leaving all of your data on your device.  

To make finding information even faster and easier, we’ve also incorporated the Home Screen that offers you new content suggestions based on your past searches. You can organise your bookmarks in individual Collections. Xayn lets you find all the information you need in this vast storage of knowledge called the internet, whilst protecting your privacy with every search.

Privacy shouldn’t be a luxury, nor should it be reserved for people who have something to hide. It’s a fundamental human right that should be granted to all, especially when doing something as basic as searching for information online. We know privacy-protecting search can be combined with a great user experience and search personalisation.

Welcome to the next generation of privacy tech.

© Photo by Scott Graham on Unsplash