Saturday, November 21, 2015

Why Python Iterators Frequently Return self From __iter__

In most introductory examples of the Python iterator protocol, you will see simplistic example classes whose __iter__ method just returns self. The class then implements either the next method (Python 2) or the __next__ method (Python 3), which does the actual work of iterating.

The iterator protocol does this:
1. Call iter on something (which calls the thing's __iter__ method implicitly).
2. Assume you can call next (or __next__) on the result from iter, and keep doing so until something breaks.
Python for-loops and comprehensions are examples where the iter protocol is used implicitly.
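The two steps above can be sketched as a plain while-loop (using Python 3's built-in next, which calls __next__; Python 2 calls the method next instead):

```python
# Roughly what `for x in [10, 20, 30]: ...` does under the hood.
it = iter([10, 20, 30])    # step 1: calls the list's __iter__
result = []
while True:
    try:
        x = next(it)       # step 2: calls the iterator's __next__
    except StopIteration:  # the "something breaks" that ends the loop
        break
    result.append(x)

print(result)  # [10, 20, 30]
```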

If you want to make your custom class iterable, it needs an __iter__ method, and whatever __iter__ returns must itself have a next / __next__ implementation. Since presumably you also want your class to have custom behavior for whatever "next" means, you want to implement next yourself, and for that custom next to get called, __iter__ needs to return the instance object itself (the thing referred to as self).

This is why you see a lot of examples that look something like this:

def __iter__(self):
    return self

# followed immediately by

def next(self):  # __next__ in Python 3
    # Do real stuff
Even though __iter__ is required in order for a class to conform to the way this protocol works, it is usually next that does all of the work, and __iter__ is a trivial formality just to make sure the instance of your custom class is the thing that gets nexted.
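As a complete sketch of that pattern (the class name and the countdown behavior are just illustrative), here __iter__ is the trivial return-self formality and all the real work lives in __next__ (spelled next in Python 2):

```python
class CountDown:
    """Iterates from start down to 1, then stops."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # Trivial formality: the instance itself is the iterator.
        return self

    def __next__(self):  # def next(self) in Python 2
        if self.current <= 0:
            raise StopIteration  # signals the end of iteration
        value = self.current
        self.current -= 1
        return value

print(list(CountDown(3)))  # [3, 2, 1]
```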

But, in those cases when you don't want to create your own custom next behavior, you don't have to!

You can just return some other iterator directly from __iter__, and as long as that iterator does have a next method, it will work fine:

class Foo:
    def __iter__(self):
        return iter(range(10))
[x for x in Foo()] # returns [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Why didn't you need a custom next method in Foo? Because the thing you planned to return from __iter__ already had its own next method (in this case, the iterator implementation for lists). After calling iter(Foo()) (which is done implicitly in Python loops and comprehensions), you're no longer really dealing with anything Foo-specific -- you're dealing with iter(range(10)).

Hopefully this helps you think about implementing iterators more clearly. When you want to implement your own custom step-by-step iteration logic, that logic is going to live inside of a next method that your class has, and the class's __iter__ will just be boilerplate to gain access to that next method.

But when you don't want to write your own step-by-step iteration logic, such as when you just want to mimic the iteration logic of something else (like a data member of the class, perhaps), then inside your class's __iter__ you just punt to that other thing's __iter__ and assume that its implementation of next will be used -- your class will not need its own next.
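For instance (Bag and _items are hypothetical names, just for illustration), a wrapper class can hand back its data member's iterator and never define next at all:

```python
class Bag:
    """Wraps a list and iterates exactly like it."""
    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        # Punt to the list's own iterator; Bag needs no next/__next__.
        return iter(self._items)

print([x for x in Bag(["a", "b", "c"])])  # ['a', 'b', 'c']
```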

But, do note that whether you implement next yourself, or you punt to the next implementation of something else, ultimately the iterator protocol must bottom out with some object that does have its own implementation of next -- and for the class where it bottoms out, the __iter__ method will basically always just return self.

Here's a silly example:

In [13]: import itertools

In [14]: class list_then_dict:
    def __init__(self, l, d):
        self.l = l
        self.d = d
    def __iter__(self):
        return itertools.chain(self.l, self.d.iteritems())

In [15]: l_then_d = list_then_dict([1, 2, 3], {'a':4, 'b':5, 'c':6})

In [16]: [x for x in l_then_d]
Out[16]: [1, 2, 3, ('a', 4), ('c', 6), ('b', 5)]
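The transcript above is Python 2 (dicts have iteritems there, and their ordering is arbitrary, hence the scrambled tuples). The same idea works under Python 3 if you swap iteritems() for items() -- itertools.chain returns an iterator with its own __next__, so the class still needs none of its own:

```python
import itertools

class ListThenDict:
    """Yields the list's elements, then the dict's (key, value) pairs."""
    def __init__(self, l, d):
        self.l = l
        self.d = d

    def __iter__(self):
        # chain() is itself an iterator, so we never write __next__ here.
        return itertools.chain(self.l, self.d.items())

print(list(ListThenDict([1, 2, 3], {'a': 4})))  # [1, 2, 3, ('a', 4)]
```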

Monday, November 9, 2015

Law of Rents, American Dream, and Status

A late 2014 Atlantic article about the tension between upward mobility and home ownership was making the rounds on Hacker News today. The tagline of the article provides a reasonable synopsis... "The paradox of the American Dream: The best cities to get ahead are often the most expensive places to live, and the most affordable places to live can be the worst cities to get ahead." The claim is generally supported by some cursory data: cities that offer the chance for upward mobility provide extremely low access to housing that can be afforded on millennial salaries in that city. Meanwhile, in cities where a larger portion of housing can be afforded on millennial salaries, there's no chance for upward social mobility.

User 'mattgibson' makes a valuable observation:

"This is not a paradox, it's one of the best known axioms of economics: < >.
The key driver is that land values rise when earnings rise. You can't move land, so there is a set amount of it within commutable distance of these cities. Also, each bit is in a unique location relative to all the other bits, so every bit of land is either better or worse than the others in terms of your earning potential when you live there. Close to the centre = cheaper and quicker commuting, whereas further out means more expensive and slower. Buying or renting a bit of it to live on is always an auction, so the best bits (where you earn most) are won by those who feel it's worth paying extra to get them (because they are the ones who earn the extra income by living there).
In this way, if wages rise, then the maximum bid that a person can justify paying in order to live in that city rises, and therefore property is more expensive.
This is not a free market, BTW. Land is monopoly."

Another way to say this is that what is commonly referred to as the American dream is in direct contradiction to the law of rent.

The article chooses to explain "the American dream" like this:

"The American Dream begins with a good job and place to live that you can afford." 

Generally in this context, 'a good job' is taken to mean that the pay and benefits offered as compensation elevate you comfortably above the total cost of living -- which is also the idea of upward social mobility. As mattgibson pointed out, if a lot of people are colocated in a region with a scarcity of minimally acceptable housing, then the first condition of the American dream (that many of these people have compensation comfortably in excess of the cost of living, enough to buy the property they want) necessarily implies that prices will be bid upward to exactly the tipping point where all of the "comfortable excess" compensation above cost of living is eaten up by elevated home or rent prices.

So the second condition of the American dream ("a place to live that you can afford") can only be fulfilled if you change your perception about what kind of home you want -- and so a lot of the article can be understood as describing a preference shift in millennials (and many others, for example I am a bit older than millennials but also face this dilemma).

Instead of preferring "normal American family homes" as have been readily available in many good-to-live-in places for about the last half century, we now have to say that such preference is no longer an affordable preference to have (even though probably most of us are unhappy that world circumstances have dictated this change in the affordability of the preference, and feel it is an overall bad sign for world economic progress). And we replace the preference with what seems to be more of a desire for social mobility, urban amenities, and proximity to very large employment hubs.

The inevitable tension between the two components of the American dream, a good job and an affordable house, has caught up with us -- predictably so. And in response, it seems social mobility is now preferred over preserving the American 'family home' lifestyle, which would mean forgoing wages and social mobility in order to live in a more affordable area.

I expect that a lot of this has to do with status signaling and money, and probably also mate selection. The relative gains in status to pure wealth optimization seem much stronger now than before -- even if wealth was still the most dominant part of status in the past. This is anecdotal, so I could be very wrong, but it just seems to me that you could earn a lot less, but establish a family home, and still be a high-status member of a community in the past.

But now, it's just money.

It doesn't matter if you've built your own, stable life in a small town community in America, and you've done so with extremely impressive hard work, frugality, and wits, while also supporting and nurturing a family. If you do all that now, but you don't earn a high wage, you're treated as a total nobody for society to shit on and mock.

But if you spend all day doing things that are arguably bad for society in a cubicle farm in a skyscraper and go home to a tiny apartment in a city where you'll never be able to buy property, yet your salary is in the top 25% in the country, you're treated like you deserve respect and like you are a productive member of society. More or less.

Thursday, October 8, 2015

Equality and Determinism

A few weeks ago, Dylan Matthews published a fascinating Vox article, The case against equality of opportunity. Matthews argues that elites and policy-makers focus too much on equality of opportunity (ensuring everyone gets an equal starting point) and that this is inherently problematic. Instead, we should focus on equality of outcomes (everyone gets an equal, or at least minimally acceptable, ending point). Ensuring that opportunity is equal would mean Harrison Bergeron-like artificial restrictions on virtually every life choice, lest someone with slightly smarter parents, say, be given an advantage (unequal opportunity) because those parents read to them more, taught them profitable skills, or simply lucked into more advantageous genetic predispositions.

Matthews provides a concise summary of his central argument:

"The motivation to work hard and make a serious effort isn't simply a personal choice. It's the result of millions of environmental and genetic factors: Did your parents push you growing up? Are you predisposed to depression? Did you go to a good school? Were you held as an infant? Did you inhale lead fumes as a child? The ability to work hard is a privilege, spread unevenly across genomes and households, with more going to the rich than to the poor. People who struggle with motivation due to factors beyond their control — be it genetics or mental illness or socioeconomic deprivation — do not deserve our scorn. They deserve our help.
Elites like to talk about effort because it justifies their own positions. It provides a non-arbitrary explanation for their wealth and privilege. It offers an excuse for elites to look out for disadvantaged people with whom they empathize, and not those with whom they feel no kinship. We look at an oft-suspended kid with a 1.4 GPA and see a delinquent. We look at a violinist with a 4.0 and see ourselves. And so we wind up helping the one who needs less help to begin with."

Why go so far as to attribute volition to elites ("Elites like to...") when you have given a perfectly good reason to feel that this is no more the "fault" of elites than the actual lack of motivation is the fault of the impoverished, namely determinism?

That is, if we are going to drop down to a hyper-fine resolution to examine genetic and environmental factors that may cause impoverished people to have a higher propensity to squander opportunity, why wouldn't we also look to genetic and environmental factors to explain why elites have a higher propensity for rationalizing their success by retroactively casting their life story in terms of working hard and making good on an opportunity? If, as with opportunity squandering by the impoverished, we are all just deterministically defined by our genes and environmental drivers, why are we looking to attribute bad volition to elites but shield the poor from being blamed? Why are we using any normative language at all?

Matthews says this (emphasis and formatting mine):

"People who struggle with (motivation) due to factors beyond their control — be it genetics or mental illness or socioeconomic deprivation — do not deserve our scorn. They deserve our help."
Why doesn't he also say this?

People who struggle with (believing that squandered opportunities can be primarily attributed to genetic and environmental circumstances rather than the volition of the poor) due to factors beyond their control — be it genetics or mental illness or socioeconomic excess — do not deserve our scorn. They deserve our help.

It seems to me that if you're going to go with a determinism argument -- effectively saying that due to genetic propensity and circumstance, many poor people cannot make good on opportunities, then you've got to support the other side of the coin as well and say that due to genetic propensity and circumstance, many successful-due-to-luck-of-genetics-and-circumstance people cannot make good on opportunities to view their own success as good luck or failures of others as bad luck -- and policy-makers are just making the policy that their genetics and circumstance has determined that they will make, and so on, and ultimately Matthews's own writing of the article was itself merely an expression of the particular genetic predispositions and circumstances that comprise him, and in the end we're all just going to argue for whatever we argue for, and something will be the winning argument (probably also decided by luck) and that's that.

Of course, I'm being a bit coy. Whether or not we have free will over our actions is a tremendous philosophical problem, and even if we don't, it still feels like we do, and most of us rationalize our daily experiences by internalizing them as volitional actions either on our own part or on the part of other actors and entities. So the world isn't going to just swallow determinism and move on. But still, it raises the question of why we are OK with arguments from determinism that admonish the rich and protect the poor, but we're not OK with determinism that simply says admonishing or protecting anyone at all is silly.

As you might guess, my feeling is that it has a lot to do with Hansonian status signalling, particularly well summarized in Unequal Inequality, Inequality Talk is About Grabbing and Sex Prizes.

Tuesday, September 1, 2015

Married to My Birthday

In honor of turning 0x1E today, I thought I would share with you my favorite web comic, Married to the Sea. In particular, I've curated the set of comics appearing on my birthday going back to the beginning of the comic in 2006. And at the bottom I included my favorite, even though it's not from my birthday, just for fun.

2006 - Beautiful Country

2007 - Cash Croppin in Kentucky

2008 - Fixer Upper

2009 - Gentrification Puppy

2010 - Books You Wanted

2011 - If You're Listening, God

2012 - Back on the Highway

2013 - What Would the Smurfs Do

2014 - Smoke Skulls

2015 - Boarded Up Window

My personal favorite:

What Are These Called

Tuesday, July 21, 2015

Interactive, Timed Programming Tests Make Little Sense

I've noticed a trend in my performance on programming and math interviews. Any time I have to do this sort of thing in front of someone else (in person, online, or over the phone), my solutions tend to be mediocre or worse, unless I am presenting a prepared solution. Yet any time I am given space and solitude to solve a long-form programming test I tend to do exceptionally well, even if it involves giving a presentation or other type of verbal explanation of the solution. And I almost always solve things both faster and more robustly when I am given solitude than when I must type out a solution in front of someone else.

I'm starting to wonder if the social aspect of coding or solving problems in front of someone is the main factor. In a job interview with someone you have never met before, when you're an introvert who is worried about how you come off socially, not saying the wrong thing, and so forth, it's like your brain RAM is taken up with all of these extra social processes, leaving less bandwidth for calling up your creative ability to solve some problems or even basic memory recall. I wonder if anyone has done experiments on whether interviewees exhibit better memory recall during time-constrained interviews or time-unconstrained ones, or with explicit vs. implicit indicators of time constraints, or when solving problems alone or in front of an evaluator.

No one likes to be seen making a mistake, so I have to believe that lots of little social biases creep in that throw off your whole way of working. When you're solving problems in the environment you find comfortable, you are at ease. You can make a mistake and it's ok. And that freedom lets your mind wander to the solution without worrying over artificial constraints, such as whether you're misspelling words in the documentation, or you type too slowly -- things that utterly don't matter from a technical point of view.

Obviously, an employee has to function in front of colleagues. But if you think about it, performing one's programming job never resembles the type of coding or programming that is tested during these interactive interviews. If you are given some kind of data structure trivia or a tricky probability riddle, then solving the problem really is all about silent focus, enumeration of the problem's details and constraints, and then diffing them against what you know about the problem already. In real work, this is when you close your office door, put on headphones, or take your laptop to the quiet room. You might have to do this recursively, you might try something, stumble and go back, and you might first provide rough sketches of solutions before going back to refine the bookkeeping details and corner cases. But the point is that you rarely, if ever, choose to do this with your friend or boss sitting at your desk with you. It's even less common for the nature of the work to require that the solution be dreamt up and executed in front of someone else. 

Even in a collaboration-intensive field like quant finance, which can often have tight intra-day deadlines, I never experienced any need to solve problems in front of other people. It was purely about receiving instructions from others, asking clarifying questions to nail down the needs and scope of the problem, executing the solution in solitude, and then coming back together with the project stakeholders and team mates to explain/share the solution and to consume explanations and solutions from those people for my own work too. In fact, I struggle to imagine how such busy intra-day deadlines could be dealt with otherwise. If you insisted on lots of pair programming and interactive solution-watching, your team's ability to solve multiple problems asynchronously would plummet.

It seems to me that none of the major traits of this core software problem-solving work are social tasks. You don't narrate aloud on your first pass over a problem -- not even when you're doing something like pair programming. Which means performance on these problems has little to do with performance in a short, timed, socially awkward setting like an interactive online coding session or a phone interview. If my description is correct, we should expect to see a weak relationship between timed coding test performance and long-term job performance ... possibly even a negative relationship between the two, if the skills that let you solve problems verbally on the spot anti-correlate with the skills needed for the more methodical, patient, creative process of long-term software problem solving.

If you contrast all of this with long-form code tests, the differences are stark. You'll still need to access knowledge about data structure trivia to approach the long-form solution, but you'll do it with whatever degree of solitude personally works for you. You won't do it interactively with a total stranger while being neurotically worried about how you sound (unless that's your thing). And, what's really nice, you'll still have to eventually talk to people and explain your solution. If you submit a bunch of mysterious code and you can't give a great account for how it works or why you made certain choices, the evaluators are going to be suspicious that you had outside help, or that your communication skills are not sufficient for the role, or that you don't truly understand how a certain approach works under the hood even if you recall some superficial details about it.

So with the long-form tests you get the best of both worlds. The programmer can write code in an environment suitable to his or her personality, without awkward social pressure brain processes being activated at the same time they are trying to solve a problem. And you can still probe them for any kind of explanatory communication you need to ensure that they really solved the problem and that they know how to relate the solution to others. You are measuring the kind of skills that actually matter on the job, while still having plenty of opportunity to assess communication skill.

One other benefit is that the solution process is not degraded by miscommunication. Often during timed, interactive programming tests, the assumptions or expectations are not clear to both parties. I once worked on an interactive interview that involved writing unit tests. However, the problem itself involved a hypothetical class that had not been written by anyone, and I was instructed to pretend that it had been implemented, with a particular set of attributes and methods, and to build my testing solution around that. None of it was documented in the problem statement beyond the name of the class to use.

But when it came to writing the unit tests, everything hinged on exactly how the class was implemented. I kept asking what kinds of things I could assume about the hypothetical class-whose-methods-mattered-for-testing, and the interviewer kept giving terse answers that did not address my questions. Likely we were talking past each other (I mean, we were complete strangers who had never worked together and only just been introduced 25 minutes earlier, trying to collaborate over a shared editor screen). But it easily caused 10 or 15 minutes of wasted time, dead air, re-asked questions, and so forth, all of which could have been avoided if the problem had been given long-form, perhaps even requiring me to do more work by implementing both the hypothetical class and the unit tests depending on it. Of course there might still be questions, but they can be rounded up after hours of careful thought, and then shared asynchronously by email so that no one is pressured in real-time to accept inadequate answers due to people talking past one another. 

As it turned out, however, my interviewer likely walked away from the assessment with a negative feeling about my ability to solve that sort of problem, and I walked away with a negative impression about the communication abilities of the other workers in that company. Both conclusions were probably wrong, but you can't unfeel that kind of awkward interview.

I'm sure that timed tests still have a place, I'm just not sure what that is in the world of programmer or quantitative researcher evaluations. Maybe they have to come out when the lower administration cost and shorter turn-around time for feedback are crucial, such as for extremely large firms that evaluate many candidates. Sadly, the test will be biased to favor extroverts whose natural thinking processes are not significantly disrupted by the social norms of a formal conversation with a stranger. Even in the best case they will be subject to degradation just due to common verbal misunderstandings, assumptions, clarifications, etc., in the presence of the time constraint. And overall this will have a population effect on the culture and type of programmer who will aggregate into companies that perform these sorts of evaluations.