Hacker Newspaper: A Case Study in Signal and Noise

The inimitable Giles Bowkett is the mad genius behind an awesome little Hacker News filter called Hacker Newspaper. It’s quickly become the only way I read Hacker News. It cuts out a lot of the bullshit and presents what’s left in a meaningful, easy-to-read fashion.  In essence it takes a channel, removes noise, and then improves the signal somewhat by reshaping the message to be easier to understand.  Noise down, signal up, SNR goes way up.

Compare this to horrid, ad-laden spam blogs that scrape other sites for original content and then wrap it in ads.  These take a signal and, without modifying the signal at all, add noise.  This dilutes the signal, causing the SNR to go down.  Sometimes they even combine the signal from multiple sources, adding interference and confusing things even further.

In a recent post, Bowkett ponders what the difference is between his tool and site-scraping spam blogs.  I’m fairly sure that our conclusions on this point are isomorphic, but just framed in different terms.  In my opinion, it all comes down to signal and noise.  Hacker Newspaper remains true to the original signal, but filters out a lot of noise.  This means that it’s unequivocally a better source for that signal.  (Why listen to Hacker News when Hacker Newspaper provides the same signal, only stronger?)  Spam blogs, on the other hand, are reverse-filters that add noise and cause signals to bleed into one another.

To put it another way, Hacker Newspaper is a filter and spam blogs are anti-filters.  I think this not only explains the difference between them, but also cuts to the heart of why people express moral outrage at spam blogs and (usually) don’t at things like Hacker Newspaper.  People, especially those in the hacker community, appreciate good filters.  They appreciate them so much, in fact, that they’re willing to totally ignore any principles they may have with regard to “stealing content” if what’s done with that content makes the signal come through stronger.

On the flip side, hackers are rightfully annoyed by obfuscation and irritated when someone willfully makes our lives harder just to make a buck.  This is essentially what happens when spam blogs wrap a signal in noise that profits the owner of the spam blog.  The message gets harder to interpret just so someone who’s too lazy and useless to make their own content can make some easy money on someone else’s message.  It should go without saying that this makes spam blogs unmitigated bullshit that are full of naught but evil and fail.

People’s reaction to this lameness often gets expressed as moral outrage.  (“How dare you steal content!”) I think it gets framed this way because, given the choice between making an argument from utility and from morals, most people prefer the moral high ground.  After all, righteous indignation is much cooler than simple annoyance.

That’s an oversimplification, of course, but most people’s principles, when it comes to things like “stealing of content” (scare quotes employed because the theft metaphor is a pretty inaccurate one when it comes to digital content), make a lot of allowances for adding value.  For many people, the principles are genuine, but misstated.  The objection isn’t to reusing content, it’s to diluting someone else’s hard work.  Or to plagiarism for the sake of profit.  Or simply to the loss of value that occurs with any loss of SNR.

In other words, people don’t really mind it when someone reuses content, they mind when someone ruins content.

Determining Number of Factors in PHP

I’m currently learning PHP.  It’s not the first dynamically typed language I’ve worked in (I’m pretty handy in JavaScript), but I definitely do the lion’s share of my hacking in statically typed languages.  (These days, mostly C#).  The below code is hardly revolutionary, but it works well enough.  And suggestions for improving it are most welcome.

	function DetermineNumFactors($numberToCheck)
	{
		//boundary case: one has exactly one factor (itself)
		if($numberToCheck == 1)
			return 1;
		$numberRoot = sqrt($numberToCheck);
		//start at one to count the factor 1; its partner (the number itself) is added by the doubling below
		$numFactors = 1;
		for($i = 2; $i < $numberRoot; $i++)
		{
			if($numberToCheck % $i == 0)
				$numFactors++;
		}
		//we have all the factors up to, but not including the square root
		$numFactors *= 2;
		//now we have all the factors above and below the square root.
		if(floor($numberRoot) == $numberRoot)
		{
			//if root is a natural number, then it's a factor
			$numFactors++;
		}
		return $numFactors;
	}

One notable bit is the floor/ceiling comparison.  I needed to check to see if the root is a natural number.  (If the square root of the number is a natural number, then the square root of the number is also one of its factors.)  Comparing the square root’s floor and ceiling was the first solution that popped into my head.

I had considered just modding the number by one and comparing the result to zero, but the mod operator truncates floats before performing the modulus, so ($foo % 1) will always return zero.  If anyone has a better solution, please let me know.
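For what it’s worth, one candidate is PHP’s built-in fmod(), which computes a floating-point remainder without casting its arguments to integers first.  The sketch below is only an illustration of that idea.  The helper name IsWholeNumber is made up for this example and isn’t part of the function above.

```php
<?php
// Hypothetical helper (not from the original post):
// returns true when $value has no fractional part.
function IsWholeNumber($value)
{
	// fmod() keeps the fractional part of the remainder,
	// whereas % converts both operands to integers before dividing
	return fmod($value, 1) == 0;
}

var_dump(IsWholeNumber(sqrt(16)));    // bool(true), since the root is exactly 4
var_dump(IsWholeNumber(sqrt(15)));    // bool(false), since the root is 3.87...
```

Like any floating-point check, this assumes the square root of a perfect square comes back exact, which should hold for integers of ordinary size.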

UPDATE: Two improvements to the method above from the comment section. Andy makes the excellent point that my factorization can start from two, since one and the number itself will always be factors. In order to make this work, I also have to then include an if to handle the boundary case for if the $numberToCheck argument is one.

Chris astutely pointed out that my ceil() check is superfluous and that I can get by with just if (floor($numberRoot) == $numberRoot).

Thank you both for your feedback.

Signal / (Noise + iPad)

This past week, I’ve been pointedly avoiding a few of my favorite tech sites and blogs, just because the iPad has put a noticeable negative spike in their SNRs.  The covetous cries of Mac-Fans everywhere are as annoying as they are unsurprising.  These people are clamoring for the chance to lay down $500+ of their own cash for an oversized iPhone with fewer features.  (Though there will be bonus Awesome Points for the first person to load Skype on the iPad and use it as a hilariously oversized phone.)

I basically see the iPad as an answer in desperate want of a question.  Netbooks are cheaper and more useful.  An iPhone is more portable and is, you know, a phone to boot.  My Motorola Droid does even the iPhone one better by being able to multitask.  There’s literally nothing that the iPad does that other products don’t already do better.  In fact, there’s nothing the iPad does that isn’t already done better by an existing Apple product.

Herein lies a beautiful object lesson about both Capitalism writ large and the tech market in particular.  It’s not about what I want or you want or any other one person wants.  It’s about what the Market wants.  The Market is a messy, emergent force caused by the desires and information of millions of actors.  And enough of those actors want this thing (and want it with a fierceness) that this underpowered hunk of white, round-edged plastic is going to make a killing.  The odds are significantly non-zero that people will be waiting in line for days to get their hands on one.

Simply put, Apple is going to make a killing on these.  They’re going to make money hand over fist.  The profits from the iPad are going to bankroll Steve Jobs’ hyperbole and black turtleneck habits for years to come.  So no matter what you or I or anyone in particular thinks of the iPad, the Market has spoken.  We may judge the iPad to be lame, but the Market will undoubtedly judge it to be awesome.

And from a business standpoint, that’s all that matters.

Architecting for Behavior

I’m currently working my way through Rohit Khare’s dissertation on using REST methods in Decentralized Systems.  The main work of the paper is interesting, but I wanted to comment briefly on the methodology.  Khare takes the interesting approach of articulating several known problems with decentralized systems and then designing an architecture which enforces graceful solutions to those problems.

“The basic organizational pattern of this dissertation is the pairing of a desired property and the architectural constraints that induce it. We design several new styles using this pattern of alternating subsections: specification of the required properties, a definition of the new style, validation that the new style correctly implements the abstract specification, and implementation issues encountered in practice.”

This architectural approach to solving difficult problems is an interesting one.  Architecture’s often given lip service, but it’s rarely treated well.  It’s often treated (by both suits and hackers) as the annoying middle step between defining what a system does and building that system.  It’s the irritating goody-two-shoes at the party that says “wait, before we pound this third round of tequila shooters, we should really figure out how we’re getting home.”

…Okay, so that analogy went to kind of a weird place, but my point is that architecture is often seen as a necessary evil.  Business folks see it as yet another task in the Gantt chart that pushes back the delivery date.  Programmers see it as the busy work necessary to divvy up the project amongst themselves.

(Interestingly, programmers tend to approach architecting personal projects with the same glee one might expect of a kid designing the treehouse of their dreams.  Put a bunch of coders together on a business project, however, and that joy for architecting disappears.  I imagine this indicates that how a coder feels about architecture may serve as a litmus for how they feel about the project as a whole.  I can’t decide whether this qualifies as an interesting insight or as crashingly obvious.)

I think that Khare’s approach is an elegant demonstration of the way software professionals should approach architecture.  What Khare is doing in his dissertation is articulating the desired behaviors of the system, and then designing it so that the system cannot behave in any other way.  In this way, architecture becomes an integral part of specification enforcement.  (In more applied contexts, it also simplifies programming and testing the system.  After all, if every possible behavior that can be taken care of at the architecture level has been, then those are behaviors that you don’t have to worry about in implementation or testing.)

If you approach architecture as a way to satisfy some parts of a specification, then it becomes more obvious why software professionals should care about it.  Suits should care about architecture because, since it enforces some parts of the spec before any code is even written, it will make subsequent tasks (implementation and QA) faster.  It will also create a better and more reliable end product.

Coders should care about architecture because it makes their jobs easier and more enjoyable.  It helps them deliver a stronger product, with less risk and effort on their part.  It also helps them do all that more quickly and more consistently.

Personally, I think that both should care about architecture because it’s really damned interesting and can lead to some amazingly elegant solutions.  But I guess I can’t really demand that other people be as interested in good design as I am.  To paraphrase the amazing Randall Munroe: “Architecture: It Works, Bitches!”

return Posts[0];

I suppose the first post should be all introspective and shit, but I’m honestly not feeling it.  All I have to say is that this blog will always strive to choose Win over Fail and to engage in random acts of Awesome.  As its proprietor I make no other promises.

Let’s get hacking.

Magic Blue Smoke

House Rules:

1.) Carry out your own dead.
2.) No opium smoking in the elevators.
3.) In Competitions, during gunfire or while bombs are falling, players may take cover without penalty for ceasing play.
4.) A player whose stroke is affected by the simultaneous explosion of a bomb may play another ball from the same place.
4a.) Penalty one stroke.
5.) Pilsner should be in Roman type, and begin with a capital.
6.) Keep Calm and Kill It with Fire.
7.) Spammers will be fed to the Crabipede.