So I want to call it out: where we are today is bullshit. As engineers, we can, and should, and will do better. We can have better tools, we can build better apps, faster, more predictable, more reliable, using fewer resources (orders of magnitude fewer!). We need to understand deeply what we are doing and why.
From there it scores each of those elements by number of:
* characters by the hundred (up to 3)
These scores also count towards the parent elements, but scaled by depth (especially beyond depth 2)
From there it scales these scores by how much of the text is in links (a likely indicator of navigation) and tracks only the top 5 candidates.
Then it looks to see if it captures more useful text by looking at ancestors and/or siblings.
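To make the scoring above concrete, here's a rough sketch in Python. It is not the actual Readability code: the `Candidate` type, the function names, and the exact depth dividers (1, 2, then three times the level) are assumptions for illustration. Only the "one point per hundred characters, up to 3", the link scaling, and the top-5 cutoff come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for a scored DOM element."""
    name: str
    text_length: int   # characters of text inside the element
    link_length: int   # characters of that text which sit inside links
    score: float = 0.0


def length_points(char_count: int) -> int:
    """One point per hundred characters, capped at 3."""
    return min(char_count // 100, 3)


def ancestor_divider(depth: int) -> int:
    """Assumed dividers: depth 1 is the parent, depth 2 the grandparent.

    Beyond depth 2 the score is cut much harder.
    """
    if depth <= 1:
        return 1
    if depth == 2:
        return 2
    return (depth - 1) * 3


def link_scaled_score(candidate: Candidate) -> float:
    """Scale a score down by the share of text that is inside links."""
    if candidate.text_length == 0:
        return 0.0
    density = candidate.link_length / candidate.text_length
    return candidate.score * (1 - density)


def top_candidates(candidates, limit=5):
    """Keep only the best few candidates after link scaling."""
    return sorted(candidates, key=link_scaled_score, reverse=True)[:limit]


article = Candidate("article", text_length=1000, link_length=50, score=10.0)
nav = Candidate("nav", text_length=200, link_length=190, score=10.0)
best = top_candidates([nav, article])
```

With these made-up numbers the navigation block keeps only 5% of its score, so the article sorts first despite the equal raw scores.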
Layer 3 considers elements which:
* are not marked hidden
* don't look like a byline/the author name (those are rendered separately)
* (optionally) pass a check based on the absence/presence of certain classes, unless they're in a <table>
* are a <section>, <h2+>, <p>, <td>, or <pre>
* or are inline-containing <div>s (rewritten to <p>s)
* or are <div>s containing a single <p> (as on mobile.slate.com)
* and have more than 25 characters
@bleakgrey The next layer might (if configured to do so) consider giving up if there are too many elements on the page.
Then it removes all <script>, <noscript>, and <style> tags, before replacing "chains" of <br>s with a <p> containing its subsequent siblings and replacing all <font>s with <span>s.
From there it examines the metatags for useful information (filling in any missing excerpts), and after layer 3 it (unnecessarily here) makes links absolute and removes any classes.
The first layer uses document.write() to drop the existing markup from the page. Then in addition to the page's text, it adds back in its extracted title and byline. It also computes an estimated reading time at 200 words per minute, before removing any attribute besides "src" and "href" and annotating the page with a theme class.
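The reading time estimate is simple enough to sketch. This is a guess at the arithmetic, not the real code: splitting on whitespace, rounding up, and the one-minute floor are all assumptions, and `estimated_reading_minutes` is a made-up name. Only the 200 words per minute comes from the description above.

```python
import math


def estimated_reading_minutes(text: str, words_per_minute: int = 200) -> int:
    """Estimate reading time from a whitespace-separated word count."""
    word_count = len(text.split())
    # Assumption: round up, and never report less than one minute.
    return max(1, math.ceil(word_count / words_per_minute))


sample = "word " * 450
minutes = estimated_reading_minutes(sample)
```

Here 450 words at 200 words per minute comes to 2.25 minutes, which this sketch rounds up to 3.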
That is, it looks through any visible <p>s (not in an <li>), <pre>s, or <br>-containing <div>s with more than 140 characters, discards some depending on class names, and sums the square roots of the remaining character counts.
If so it sends a message to the UI telling it to show the button offering a reader mode.
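Here's a rough sketch of that check, working on plain character counts rather than a real DOM. Two things are assumptions on my part: that the square root is taken of the characters beyond the 140 cutoff, and that the sum has to pass a threshold of 20. The names are made up, and the visibility and class-name filtering is left out.

```python
import math

MIN_CONTENT_LENGTH = 140  # from the description above
MIN_SCORE = 20.0          # assumed threshold, not stated above


def looks_readerable(block_lengths) -> bool:
    """Decide whether a page probably has enough article-like text.

    block_lengths holds the character counts of candidate blocks
    that already survived the visibility and class-name checks.
    """
    score = 0.0
    for length in block_lengths:
        if length <= MIN_CONTENT_LENGTH:
            continue  # too short to count as article text
        # Assumption: only the characters past the cutoff contribute.
        score += math.sqrt(length - MIN_CONTENT_LENGTH)
        if score > MIN_SCORE:
            return True  # enough evidence, stop early
    return False


offer_reader_mode = looks_readerable([540, 540])
```

The square root means a few medium-length paragraphs count for more than a single huge one of the same total length, which fits pages that are mostly prose.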
Thanks to @bleakgrey (and I think I recall someone else being involved), a new Odysseus release is coming out soon to support a "reader mode".
I find it ridiculous that I feel the need to support this feature; it's saying "webdevs are doing such a poor job that I need to offer to clear away their mess!"
In celebration of this I will describe how this code (the same as used in Firefox and Pocket) works.
everyone's shitposting about rich text, not paying attention to this article in the New Yorker about the fediverse!
> Wait, isn't that [the philosophy behind #zig] also the philosophy behind Rust?
Yeah, I guess so—which might explain why I like programming #rust so much.
On the other hand, Rust also tries very hard to empower non-systems programmers (including me!) to program at a very low level, which seems pretty different from Zig
Update on Reganam:
- I've added some styling to it
- The research tab now has some technologies to research
- There are storage upgrades as well
It is almost ready for release! Only some helpful tooltips and text styling to go for 1.0.0.
it's an ill wind that blows no-one any good.
maybe they can go full-stack FOSS in response
@rain Hmmm, the WebKit blog does state that compilers change register allocators frequently...
@hay Well, I guess Silicon Valley benefits from the illusion of constant change as it helps them better swindle investors.
Posting mostly about how free software projects work, and occasionally about climate change.
Though I do enjoy German board games given an opponent.
For people who care about, support, or build Free, Libre, and Open Source Software (FLOSS).