This morning I came across this post by Kelly Norton. He calculated the number of ‘pleasant’ days for each US ZIP code area. California seems to win the race with more than 180 ‘pleasant’ days each year. A pleasant day is defined as one whose minimum and maximum temperatures stay within certain limits.
I also had some hourly weather data lying around and decided to try the same approach. My personal ‘pleasant’ day is one where the temperature stays between 15 and 28 °C all day. Apart from producing a ranking of pleasant places, this is also a great application for Mongo’s aggregation pipelines.
[cc lang="python" width="100%" tab_size="4" lines="40" noborder="1" theme="dawn"]
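from datetime import datetime as dt
import pandas as pd

# `db` is an already-open PyMongo database handle.
res = db.weather.aggregate([
    # Only consider readings from July 2013 onwards.
    {'$match': {'_id.datetime': {'$gte': dt(2013, 7, 1)}}},
    # Daily minimum and maximum temperature per city.
    {'$group': {
        '_id': {
            'city': '$_id.location',
            'day': {'$dayOfMonth': '$_id.datetime'},
            'month': {'$month': '$_id.datetime'},
            'year': {'$year': '$_id.datetime'},
        },
        'min': {'$min': '$tempm'},
        'max': {'$max': '$tempm'},
    }},
    # Keep only days whose minimum is above 15 and whose maximum is below 28.
    {'$match': {'min': {'$gt': 15}}},
    {'$match': {'max': {'$lt': 28}}},
    # Count the remaining days per city.
    {'$group': {
        '_id': '$_id.city',
        'n': {'$sum': 1},
    }},
    # Cities with the most pleasant days come first.
    {'$sort': {'n': -1}},
])
# PyMongo 2.x returns a dict with a 'result' key; on PyMongo 3+ `res` is a cursor, so use list(res).
df = pd.DataFrame(list(res['result']))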
[/cc]
This gives us a list of cities sorted by their number of ‘pleasant’ days since July 2013. Please note that this only accounts for temperature. Other environmental factors, like air pollution or humidity, are not taken into account.
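If you want to sanity-check the pipeline without a database, the same logic can be sketched in plain Python. This is only a rough equivalent, not the code I ran: the function name `count_pleasant_days`, the tuple layout of the readings and the sample cities and temperatures are all made up for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime


def count_pleasant_days(readings, low=15, high=28):
    """Count, per city, the days whose min and max temperature lie strictly between low and high.

    readings: iterable of (city, datetime, temperature in deg C) tuples (hypothetical layout).
    """
    # Collect all temperatures per (city, calendar day), like the first $group stage.
    daily = defaultdict(list)
    for city, when, temp in readings:
        daily[(city, when.date())].append(temp)

    # Keep days with min > low and max < high, then count them per city.
    counts = Counter()
    for (city, _day), temps in daily.items():
        if min(temps) > low and max(temps) < high:
            counts[city] += 1

    # Sort by count, highest first, like the final $sort stage.
    return counts.most_common()


# Made-up sample readings, for illustration only.
sample = [
    ("Lisbon", datetime(2013, 7, 1, 6), 17.0),
    ("Lisbon", datetime(2013, 7, 1, 15), 26.5),
    ("Lisbon", datetime(2013, 7, 2, 6), 18.0),
    ("Lisbon", datetime(2013, 7, 2, 15), 31.0),  # too hot: day not counted
    ("Oslo", datetime(2013, 7, 1, 6), 12.0),     # too cold: day not counted
    ("Oslo", datetime(2013, 7, 1, 15), 21.0),
]
pleasant = count_pleasant_days(sample)
print(pleasant)  # [('Lisbon', 1)]
```

As in the Mongo version, a city without a single qualifying day simply does not show up in the result.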
Data source: Wunderground API