The Wild Wild Web

  • Keep up-to-date with the latest news of this category.
    | Search My Site
    The new X/Twitter algorithm is hard to predict, but I’ve had one post go viral with over a million views now, a quote-tweet of a cool demo video of Apple’s website builder from 2009, with t…
    | Search My Site
    I’ve been thinking about why some people can track their progress on goals without going insane, while others turn into the guy who weighs himself four times a day and has a panic attack when his Fitbit dies… You’d think there’d be some obvious personality difference: anxious
    | CORE
    On 3 December 2025, CORE (COnnecting REpositories) hosted the latest live training session in our webinar series, Become a Professional User of the CORE Dashboard. We extend our sincere thanks to the many participants who joined us from across Europe, Africa, the United States and beyond. The high l [...]
    | Marginalia Search
    The search engine recently exposed a fair number of new tools for custom filtering to the API consumers and users of the new UI. This was originally going to be an incredibly chaotic update, both announcing the new features and doing a technical walkthrough of the changes, but that ambition turned out [...]
    | CORE
    CORE Announces Strategic Support for the Creation of the Ukrainian open repositories community. Milton Keynes, United Kingdom – 27 November 2025. CORE is pleased to announce its formal support for the creation of the Ukrainian open repositories community, a national initiative led by the Scientific an [...]
    | Kagi
    TL;DR: Today we’re releasing two research assistants: Quick Assistant and Research Assistant (previously named Ki during beta).
    | CORE
    The Open University is the project coordinator for the two-year, CHIST-ERA-funded SoFAIR project, which aims to make research software a first-class, FAIR research object (Findable, Accessible, Interoperable, and Reusable). We are excited to share that our paper, “Identifying and Classifying Software [...]
    | CORE
    This International Open Access Week, the global research community is asking a vital question: Who owns our knowledge? At CORE (COnnecting REpositories), our answer is clear and unapologetic: We all do. For over a decade, CORE has stood at the forefront of the open access movement, not as a passive [...]
    | Marginalia Search
    One of the big ambitions for the search engine this year has been to enable searching in more languages than English, and a pilot project for this has just been completed, allowing experimental support for German, French and Swedish. These changes are now live for testing, but with an extremely smal [...]
    | Mwmbl
    This article was originally posted on my personal blog on 2nd August 2025. Dear friends, I am constantly besieged by the feeling that I am not doing enough. A genocide is unfolding before our eyes. I feel the guilt with every mother holding a starving child, with every doctor killed, with every jour [...]
    | Marginalia Search
    The Marginalia Search index has been partially rewritten to perform much better, using new data structures designed to make better use of modern hardware. This post will cover the new design, and will also touch upon some of the unexpected and unintuitive performance characteristics of NVMe SSDs whe [...]
    | Mwmbl
    It’s been so long since we’ve had an update on the blog that people are often confused as to whether the project is still active. It definitely is! I’m just bad at updating the blog. Most of the updates have been going to the Matrix channel. So an update is long overdue. Most of the recent work has [...]
    | Scryfall
    Scryfall is proud to announce that we’ve entered into a new partnership with Cardmarket. In the coming weeks, you should see a lot more richness in our available data for European pricing.
    | Marginalia Search
    As some of the work planned for Marginalia Search this year has been progressing a bit faster than anticipated, there was time to implement an unplanned change. This post details the implementation of a system for detecting when servers are online, to avoid serving dead links and improve data qualit [...]
    | Marginalia Search
    The most recent change to the search engine is a system that profiles websites based on their rendered DOM. The goal is identifying advertisements, trackers, nuisance popovers, and similar elements. The search engine already tries to do this, but isn’t very good at it because it’s only looking at st [...]
    | clew
    The web is mind-bogglingly huge; let's look at how personal websites can thrive and interact despite that.
    | Scryfall
    Scryfall now offers a search term for cards that use the default frame. That is, cards that aren't showcases, borderless, extended art, and so on.
    | clew
    While on a fourteen-hour international flight, I finally managed to come up with an architecture for Clew's web crawler that I'm happy with. Here's the run-down.
    | clew
    I believe I've reached a point in Clew's development where, armed with the knowledge I've acquired from months of crawling sites and using that data to search the index, it's time to wipe the index and start over.
    | Mwmbl
    By many measures, Mwmbl is doing great. We have indexed over half a billion pages, we have over 4,000 registered users, and over 30,000 curations from those users. Our volunteers are crawling around 5 million pages a day. But the score that I care about most right now is NDCG. This measures the qual [...]
    | Lumen
    Lumen has identified new signs of a coordinated and potentially automated fraudulent DMCA takedown campaign relating to articles about a Russian Olympian. Building on work done by past Lumen team members and documented in previous Lumen blog posts, the evidence presented here sheds light on previo [...]
    | Mwmbl
    It’s two years since we launched Mwmbl, the open source, non-profit search engine, on Boxing Day 2021. A good time to take stock of where we are and where we’re going. We’ve indexed over 100 million pages. Thanks to our volunteers, who crawl the web using the Firefox extension and command line script [...]
    | Mwmbl
    Mwmbl is the first search engine to allow users to change the search results: You can add results, delete them, and rerank them. The changes you make are saved instantly to the index and will be shown to other users who run the same query. But what is the point of users changing search results? Th [...]