On the organization side, I think it's been excellent. Everything worked as expected, and when I had a problem with the wifi, the organizers fixed it in literally a couple of minutes. It was great to have sushi and burritos instead of last year's sandwiches too. The Slack channels were quite useful and well organized. I think the organizers deserve a 10, and that's very challenging when organizing a conference.
More on the content side, I used to attend conferences mainly for talks. But this year I decided to try other things a conference can offer (networking, sprints, unconference sessions...). Some random notes:
I think probabilistic modelling is the area of data science with the highest entry barrier. This is a personal opinion, but one shared by many others, including authors:
It looks like there is even terminology to distinguish whether the approach used is mathematical (with formulae and proofs that are quite cryptic to me) or computational (more focused on the implementation).
It was a luxury to have Vincent Warmerdam, from the PyData Amsterdam organization, at PyData once more. He has been one step ahead of most of us, who are more focused on machine learning (I haven't met any frequentists so far at PyData conferences). He already gave a talk on the topic last year, The Duct Tape of Heroes: Bayes Rule, which was quite inspiring and made probabilistic models easier, and this year we got another amazing talk, SaaaS: Sampling as an Algorithm Service.
After that, we managed to have an unconference session with him, where we could go through the examples presented in the talk in more detail. While Markov Chain Monte Carlo and Gibbs sampling aren't straightforward to learn, I think we all learnt a lot, enough to pick up the remaining details by ourselves.
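To give a flavour of the computational approach, here is a minimal sketch of a random-walk Metropolis sampler, one of the simplest Markov Chain Monte Carlo methods. It is not taken from the talk or the unconference session: the coin-flip data (7 heads, 3 tails), the flat prior, the step size and all function names are assumptions made up for illustration.

```python
import math
import random


def log_posterior(p, heads, tails):
    """Unnormalised log posterior of a coin's bias p, assuming a flat prior."""
    if not 0.0 < p < 1.0:
        return -math.inf  # zero probability outside (0, 1)
    return heads * math.log(p) + tails * math.log(1.0 - p)


def metropolis(heads, tails, n_samples=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for the coin bias (illustrative only)."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, heads, tails)
    samples = []
    for _ in range(n_samples):
        # Propose a small Gaussian jump around the current value
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, heads, tails)
        # Accept with probability min(1, posterior ratio)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


samples = metropolis(heads=7, tails=3)  # hypothetical data
kept = samples[2000:]  # discard the first draws as burn-in
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

With a flat prior the exact posterior here is Beta(8, 4), whose mean is 8/12, so the printed estimate should land close to 0.67. The point of the sketch is that the sampler only needs the posterior up to a constant, which is what makes this kind of approach usable when the maths gets cryptic.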
There were other sessions about Bayesian stuff too:
- Bayesian optimisation with scikit-learn - Thomas Huijskens
- Variational Inference and Python - Peadar Coyle
- Bayesian Deep Learning with Edward (and a trick using Dropout) - Andrew Rowan
- Segmenting Channel 4 Viewers using LDA Topic Modelling - Thomas Nuttall
I got good recommendations for books on probabilistic models and Bayesian methods, which shouldn't take the tough (mathematical) approach:
- Bayesian Methods for Hackers
- Information Theory, Inference, and Learning Algorithms
- Computer Age Statistical Inference
- Statistical Rethinking: A Bayesian course with examples in R and Stan
There is a Meetup in London, which is the place to be to meet other Bayesians:
From theory to practice
It may be just my impression, but I'd say there have been more talks on applications of data science, and more diverse ones. While I remember talks on common applications like recommender systems in previous editions, I think there has been an increase in talks on applications of all these techniques in different areas.
To name a few:
- Data science used to see the popularity of users in a Muslim dating app
- Intelligent ventilators that make newborns breathe when they need to
- Electrocardiogram analysis with time series techniques
OK, it wasn't an interview, it was a pub quiz, but the feeling was somehow similar. After 10 years working with Python and passing challenging technical interviews at companies such as Bank of America or Google, at some point you start to think you know what you're doing.
Then, when you're relaxed in a pub after an amazing but exhausting day, James Powell starts running the pub quiz, and you feel that you don't know anything about Python. Some new Python 3 syntax, all-time namespace tricks, and so many atypical cases...
Luckily, all the dots started to connect, and I realized that a few hours before I had been talking with Steve Holden about the new edition of his book, Python in a Nutshell. It sounded like an introduction to me, but it looks like it covers all the Python internals.
Going back to the pub quiz, I think it's one of the most memorable moments in a conference. Great people, loads of laughs, and an amazing set of questions perfectly executed.
Big Data becoming smaller
As I mentioned before, my experience at the conference is very biased, and very influenced by the talks I attended, the people I met... But my impression is that the boom in big data (large deep networks, Spark...) is not a boom anymore.
Of course there are a lot of people working with Spark and researching deep neural networks, but instead of growing, I felt like these things are losing momentum, and people are focusing on other technologies and topics.
One of the things I was interested in was finding interesting new meetups. I think among the most popular ones in data science are:
- Attend PyData Barcelona 2017, which will be as amazing as PyData London, also in English, and with top speakers like Travis Oliphant (author of SciPy and NumPy) or Francesc Alted (author of PyTables, Blosc, bcolz and numexpr).
- Wait until the videos are published in the PyData channel (or watch the ones from other PyData conferences)
- Join one of the 55 PyData meetups around the world, or start your own (check this document to see how; NumFOCUS will support you).
- Join one of the other conferences happening later this year in Paris, Berlin, EuroPython in Italy, Warsaw... You can find them all at https://pydata.org/