Show HN: An extension to track your Wikipedia adventures

178 points
16 days ago
by demegire

Comments


IncreasePosts

I wrote a plugin just like this, and every day I have it present me with a quiz based on summaries of the first paragraph of each page I read that day.

Basically, I was reading way too much Wikipedia and not actually storing much information, so I have the extension shame me if I don't remember what I read.

16 days ago

cooper_ganglia

That's genius. Have you published this as an extension? I'd love automatically-written flashcards to quiz myself on what I've read that day...

16 days ago

nullindividual

We absorb the words in front of our eyes even if we're not conscious of it. A topic that you glossed over may come up in another context and remind you of that wiki article.

It shapes who we are.

And sometimes knowledge of the existence of a topic is valuable.

15 days ago

dr_kiszonka

I remember seeing an article about it on HN.

15 days ago

graypegg

I would love to mess around with this if you've published it somewhere! Folks would love it as a Show HN I bet too!

15 days ago

phailhaus

Have you tried out a tangled-tree visualization? [1] I've found it to be super useful when visualizing these sorts of relationships in a compact way, and it naturally sorts the data topologically.

[1] https://observablehq.com/@nitaku/tangled-tree-visualization-...

16 days ago

throwaway444441

Very cool! One small point of pedantry:

> A tree with multiple inheritance (sometimes called tangled tree) cannot be represented by using a classic tree visualization. It is technically a directed acyclic graph (DAG) with one (or more) nodes identified as root.

What is the difference between a DAG and a tangled tree? Isn't any DAG a tangled tree? I don't see immediately why a new definition is required.

16 days ago

S33V

I'm not entirely familiar with tangled trees, but it seems like one of the larger differences is that a tangled tree isn't necessarily acyclic. For this example, someone could navigate away from one page, but potentially be linked back to it later down the adventure.

16 days ago

throwaway444441

> A tree with multiple inheritance (sometimes called tangled tree)

By the author's definition, multiple inheritance prohibits cycles. DAGs can be modeled as a tree with extra edges to non-ancestors. So I'm pretty sure tangled tree = DAG.

> For this example, someone could navigate away from one page, but potentially be linked back to it later down the adventure.

Good point, maybe "tangled tree with back edges to ancestors" is the really correct model for what the author wants. The key point of the visualization is to highlight the deviation from a standard DAG or tree.
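
To make the distinction concrete, here is a minimal sketch (not taken from the linked notebook or from the extension) that checks whether a set of page-to-page links is acyclic and lists its roots, using Kahn's algorithm. The function names and the sample page names are made up for illustration.

```python
from collections import defaultdict, deque


def find_roots(edges):
    """Return nodes with no incoming edge (the 'roots' of a tangled tree)."""
    nodes = {n for edge in edges for n in edge}
    targets = {dst for _, dst in edges}
    return sorted(nodes - targets)


def is_acyclic(edges):
    """Kahn's algorithm: a directed graph is a DAG iff every node can be
    removed in topological order."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))

    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return removed == len(nodes)


# Hypothetical session: "Graph" is reached from two parents,
# which is the "multiple inheritance" case.
dag_edges = [("Tree", "Graph"), ("Tree", "Forest"), ("Forest", "Graph")]
# Same session, plus a link from "Graph" back to its ancestor "Tree".
cyclic_edges = dag_edges + [("Graph", "Tree")]

print(find_roots(dag_edges), is_acyclic(dag_edges))
print(find_roots(cyclic_edges), is_acyclic(cyclic_edges))
```

In the second case the link back to an ancestor both removes the only root and breaks acyclicity, which is why a browsing history with revisits needs something looser than a DAG.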

15 days ago

phailhaus

The author already says that:

> It is technically a directed acyclic graph (DAG)

But DAGs don't have 'roots', they just have nodes. The concept of roots makes it a tangled tree.

15 days ago

wiseowise

Is there a source code for the visualization?

15 days ago

phailhaus

That's a live notebook! If you click on the cells, you can see the code that was used to create it, like a Jupyter notebook.

15 days ago

wiseowise

Ah, thanks! Wasn't that obvious on mobile.

14 days ago

jack_riminton

This looks really neat

16 days ago

bloopernova

I feel like there's a lot of knowledge or information that we're "leaving on the plate". For instance, the sites we visit, the files we edit, the branches and PRs we create, etc etc. All of that is related, but it feels like that context is being lost or discarded.

An example might be: I have to include new AWS resources in a deployment, so I look up information about them, find examples and read about potential problems, security information, etc etc. That then becomes edits in a terraform file somewhere, with a Jira ticket, my own knowledge database (Emacs org-roam files in my case, Obsidian etc for other people). Then the feature branch gets a PR to dev, we might discuss changes in Teams (ugh) or a meeting. All of that seems ripe to be linked together conceptually, but the computer has no way to do that.

It makes me wonder if that could be fed into the right machine learning thing to at least start tracking this sort of work stuff. Heck just synchronizing my Firefox bookmarks (ff lets you tag your bookmarks) with my org-roam instance's tags would be useful. Tagged files in my knowledge base could be automatically linked to similarly tagged bookmarks.
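
The tag-matching part is small enough to sketch. This assumes the bookmarks and notes have already been exported into plain dictionaries of tags; how to get them out of Firefox or org-roam is not shown, and every name and URL below is hypothetical.

```python
from collections import defaultdict


def link_by_tag(bookmarks, notes):
    """Map each note title to the bookmark URLs sharing at least one tag."""
    urls_by_tag = defaultdict(set)
    for url, tags in bookmarks.items():
        for tag in tags:
            urls_by_tag[tag].add(url)

    links = {}
    for title, tags in notes.items():
        related = set()
        for tag in tags:
            related |= urls_by_tag.get(tag, set())
        links[title] = sorted(related)
    return links


# Hypothetical exported data: URL -> tags, note title -> tags.
bookmarks = {
    "https://example.com/s3-policies": {"aws", "security"},
    "https://example.com/org-mode-tips": {"emacs"},
}
notes = {
    "Deploying new AWS resources": {"aws", "terraform"},
    "Weekly review": {"planning"},
}

links = link_by_tag(bookmarks, notes)
for title, urls in links.items():
    print(title, urls)
```

Exact tag matches are the simplest possible rule; anything fuzzier (synonyms, embeddings) would replace the lookup inside the loop.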

16 days ago

surfingdino

I like these pieces of my digital footprint to not be connected. There is no need to track everything.

16 days ago

idle_zealot

Do you not want them connected, or do you not want the connections shared and potentially used against you?

16 days ago

surfingdino

I like them not to be collected or connected. I don't trust those collecting such data.

15 days ago

ranger207

I typically write up all of that in my documentation somewhere. Stuff like "first thoughts are this approach might work, talked to person who had this idea, looked at this link and found this info, decided to go with this approach because of factors x, y, z". This isn't the primary user-facing documentation but a subpage or something that's helpful a couple of years down the line

It's like a book titled "A History of [Object]" that traces what solved problems before the object, issues with old solutions, the emotional, financial, etc state of the inventor, why they chose this solution over that one, how the object was adopted and improved afterwards, other inventions spawned off the object, etc. Capturing the history of the object requires capturing the context around the object too

16 days ago

joshuahutt

My thoughts on this are to slow down and document and explore that knowledge and information. If it is really valuable, the "loss" in efficiency from slowing down will be offset by the gain in skill/utility from really grokking the stuff.

If it's not...then there's really nothing "left" on the table; if it ever turns out to be valuable, you'll probably come across it again when needed.

I constantly get a similar feeling. I'm speeding around from task to task, just grasping enough to get the current task done so I can get to the next one and the next one...

And somehow this is value-creating? Apparently it is, but it seems almost accidental, at that rate.

I'd rather slow down and appreciate the value as it moves through me, into whatever I'm doing.

I usually get more from the process, at the same time.

15 days ago

joshuahutt

It's like...if "less is more," then "more is less."

Reminds me of a floating point number. The bigger or smaller they get, the less accurate they become.

If you're chunking on a ton of data and tasks, you're getting less out of it. At a certain point, none of it even seems to enter your brain at all.

15 days ago

sslayer

Basically, this is what college should be teaching you - how to research. What good are useless facts? I don't want to walk around cluttered with a dictionary - I want to know where to look in that dictionary. Obviously in the sciences there are facts that you should know, but even with math, it's more about how to derive the formula than actually memorizing it. I mean, they're called "Research Papers", right?

15 days ago

joshuahutt

Totally agree. I remember the phrase “learning how to think” being thrown around.

I also remember not being explicitly taught that.

It sort of seems like trying to find enlightenment by chopping wood and carrying water at a monastery.

If critical thinking is something that spontaneously emerges in a learning environment, maybe we shouldn’t sell it as a benefit. “Some students experience deep insight into the nature of the mind. Results not typical.”

10 days ago

steezeburger

I've been thinking of something like this since LLMs became popular. I've toyed around with some proofs of concept, but haven't had the time or motivation to work on it lately. I love the idea of tagging everything and showing connections when you're searching for things. Also semantic search would be great, like "blue website with information about databases I read last week" would be super powerful in my opinion.

I really love the idea of digital knowledge bases, but as you said, I think we're leaving a lot on the table. I need to get back to my project of a user-owned-data knowledge base.

16 days ago

jskherman

What kind of approach did you take? I was thinking along the lines of requiring something like rewind.ai or some program that autoscreenshots your screen at a set interval (or originally a recorded video split into several images later) and having a vision-capable model (particularly one specialized in UIs) describe this set of images in order to build a dataset of images-tags-description and the like.

16 days ago

jskherman

There are also Python libraries like trafilatura, featured here on HN some time ago, that could extract content from websites to help augment the data.

16 days ago

bongodongobob

I've had similar thoughts but over time you'd just end up with a private copy of the internet. You'll still have to search for the information anyway, so I'm not sure what the benefit is. Searching your knowledge base for "the thing I did yesterday" vs "how to sync Azure to AD" seems basically equivalent to me. You're just creating yet another thing to search.

16 days ago

bloopernova

That's a good point, you'd absolutely want to get away from adding another burden to the human.

Seeing relevant bookmarks when I'm viewing a specific note in my database could be useful though. And finding pull requests related to a subject might also be useful.

So the idea would be to reduce the number of searches performed by the human. Automate and enhance rather than dump and forget.

16 days ago

eichin

Yeah, but your private copy would be more like "The internet: The Good Parts" (assuming you had a way to not store what you immediately dismissed as garbage; maybe only include pages with a dwell time of 15-30s or more.) That's enormously valuable (and why I've implemented it before - but in conkeror, which didn't survive the death of xulrunner - so now I use pinboard and text files and logseq, which are pretty good but a lot more work.)
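
A dwell-time filter like that is only a few lines. This is a rough sketch, not the conkeror implementation mentioned above: the `Visit` record, the function name, and the sample URLs are invented, and the 15-second default is just the lower end of the range suggested.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    url: str
    dwell_seconds: float


def keep_good_parts(visits, min_dwell_seconds=15.0):
    """Return URLs with at least one visit at or above the dwell threshold,
    in first-seen order and without duplicates."""
    kept = []
    for visit in visits:
        if visit.dwell_seconds >= min_dwell_seconds and visit.url not in kept:
            kept.append(visit.url)
    return kept


# Hypothetical visit log.
visits = [
    Visit("https://en.wikipedia.org/wiki/XULRunner", 42.0),
    Visit("https://example.com/clickbait", 3.5),
    Visit("https://en.wikipedia.org/wiki/Conkeror", 15.0),
]

print(keep_good_parts(visits))
```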

16 days ago

happypumpkin

To whatever extent something like this can be done locally, I'd probably pay a monthly sub for it if it's good enough. But I wouldn't want any of that leaving my machine, we get tracked and profiled enough as-is imo.

16 days ago

bloopernova

Yeah, this is worth at least as much as Kagi or Copilot is to me right now.

16 days ago

_boffin_

Working on something like that, but there’s still a good amount of work to do

16 days ago

wasteduniverse

[dead]

16 days ago

bawolff

That's cool.

I do find it ironic though that wikipedia is one of the major sites with the least amount of user tracking, and then users decide to implement the tracking themselves.

16 days ago

nullhole

That is funny, though this is more tracking-by-users than tracking-of-users

16 days ago

BlueTemplar

16 days ago

eichin

tracking for-the-benefit-of users, which only has to be done by the users because no services can be trusted :-)

16 days ago

non-

This is cool, I love how it shows you all the branches you've followed in an actual tree diagram.

The concept reminds me of https://browser.horse/ a bit, which has the concept of "trails" that track any links you visit. Great for research projects.

16 days ago

BlairCurrey

Cool tool. Might be cool to make something Wikipedia-agnostic. Sometimes I manually create such a thing via Obsidian but it's kind of tedious. It's interesting how sometimes different starting sources read far apart in time lead to rabbitholes which cross paths.

This reminds me of a python scraper I wrote a while back when I was learning to program - Youtube rabbithole: https://github.com/BlairCurrey/youtube-rabbithole

It basically just follows the next recommended video, recording the path along the way. More about tracing the youtube algorithm than tracking your own journey.

16 days ago

starkparker

Looking at https://github.com/demegire/wiki-journey/blob/main/firefox/c...

It seems likely that the extension could be customized to any Mediawiki instance? As an admin I'd love to be able to use it elsewhere. This looks like it could be a great tool working with test users on stuff like information architecture, to see the path of how they found information. (I know there are better tools for that, but something that focuses tightly on wiki interactions would be useful to me.)

16 days ago

KaiMagnus

That’s a very cool project and I wish something like this would exist for all websites.

A few years ago I did a university project where we looked into (internet) research and how information discovery and gathering could be improved. (https://www.kaimagnus.de/projects/halo)

There we had the concept of a similar looking tree. Users could then come back to their exploration and take notes, prioritize and sort.

It was only a concept back then, so it’s nice to see it in action.

16 days ago

jskherman

Similarly, perchance, is there also a browser extension that shows a tree graph or a directed node graph, like in Obsidian, for the sequence of websites in your browser history, so you can see your whole rabbit hole on the Internet? I'm pretty sure the tech is already used by the advertising industry.

16 days ago

CalRobert

Suddenly I am reminded, for the first time in maybe 2 decades, that "surfing the internet" was once a term used specifically for this kind of rabbit-holing

15 days ago

steezeburger

This is really cool! It would be super neat if the nodes were more interconnected, forming a fully connected graph rather than just a tree.

16 days ago

sixo

This is tracking the user's trajectory through the site, necessarily a tree, not the network structure of W itself.

16 days ago

random3

How is the user journey through the site necessarily a tree? What prevents the user from creating loops in their journey?

16 days ago

lyk2005

Not an absolute statement, just that it resembles a tree more closely as you branch off clicking on hyperlinks.

16 days ago

steezeburger

Oo, yeah, that's a good point! I totally see why it was done this way now.

Though I do still think it would be cool to have a toggleable overlay or something that shows the cyclic connections!
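
One way such an overlay could work, sketched under my own assumptions (this is not how the extension stores its data, and the names are invented): the first click that reaches a page becomes a tree edge, and any later click that lands on an already-seen page is kept in a separate list that the UI could draw or hide.

```python
def build_journey(clicks):
    """Split navigation clicks into tree edges and revisit links.

    Returns (parent, revisits): parent maps each page to the page it was
    first reached from (None for a root); revisits lists the clicks that
    landed on a page already in the tree.
    """
    parent = {}
    revisits = []
    for src, dst in clicks:
        parent.setdefault(src, None)  # unseen source page: treat it as a root
        if dst in parent:
            revisits.append((src, dst))  # would close a loop, keep it aside
        else:
            parent[dst] = src
    return parent, revisits


# Hypothetical session that loops back to the starting page.
clicks = [
    ("Tree", "Graph"),
    ("Graph", "Cycle"),
    ("Cycle", "Tree"),
    ("Tree", "Forest"),
]

parent, revisits = build_journey(clicks)
print(parent)
print(revisits)
```

The tree stays readable because every page has exactly one parent, while the revisit list carries the cyclic connections for an optional overlay.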

16 days ago

phailhaus

That would be technically more "accurate", but it doesn't yield more useful information and ends up being harder to read.

16 days ago

ldayley

Interesting. I’ve been using the Wikipedia iOS app (which saves history by the day) to keep track of my personal rabbit hole journeys…

16 days ago

russdpale

This is one of those ideas where you think "why the hell didn't I think of this?"

16 days ago

RockRobotRock

POV: It's 4 AM and I still can't fall asleep

15 days ago

jsunderland323

This is great! Will try to give it a try later

16 days ago

kcarter80

Narrator: they didn't.

16 days ago

KaiMagnus

Have to admit I'm slightly disappointed that the FF version only shows two users still and one of them is me.

16 days ago

krylon

I didn't know I needed this.

15 days ago

noashavit

A graph of Wikipedia rabbit holes

16 days ago

serenayakgun

wow, this is great

15 days ago

nathell

Obligatory xkcd: https://xkcd.com/214/

16 days ago

AdmiralAsshat

My Wikipedia searches are like my porn searches: no one needs to know about them, least of all myself. They bring only shame and remorse.

16 days ago

HeySVspackos

[dead]

16 days ago

xcdzvyn

This is fantastic. Great idea!

16 days ago

mistrial9

[flagged]

16 days ago

nirmel

I'll mention that I made what could be described as an AI-generated Wikipedia alternative, where you can generate articles on anything, with text links on terms that lead to new articles generated from the context of the article path that got you there. I reckon Wiki-enthusiasts won't be disappointed: https://anylearn.ai

16 days ago

dcsan

Awesome!

15 days ago