Sunday, December 17, 2017

My NIPS write up

Just as a quick disclaimer, this post is about my personal experience and opinions at NIPS 2017. I'm not an AI researcher; I work as a data scientist in industry. For a more technical summary of the talks and papers presented, you may want to check this document by David Abel.

Deep learning rigor and interpretability

This is quite a controversial topic, but this is how I see it. There are two main approaches to statistics/learning:
  1. Understand how learning works, and replicate it based on this understanding
  2. Focus on results, even at the cost of poor understanding
I think these two approaches first divided statisticians and machine learning practitioners, as Leo Breiman describes in "Statistical Modeling: The Two Cultures". In a similar way, today they divide the deep learning school, which is somewhat winning in terms of results, from other techniques.

My view on deep learning is that we've managed to understand, in a general way, how the human brain works. Not why, but thanks to the research of people like Santiago Ramón y Cajal, Camillo Golgi, Donald Hebb..., we know that it's a network of neurons, and that the "intelligence" is in how the neurons connect, and not in the neurons themselves.

With the research of Warren McCulloch, Walter Pitts, John Hopfield, Geoffrey Hinton..., we can replicate this structure of neurons in an artificial way: a set of connected linear regressions, with activation functions to break the linearity. And with current computational power, including optimized hardware like GPUs, we can implement networks of neurons at a huge scale. We know that the model works, because it works for the human brain, and we're confident it's the same. But we don't know how each neuron is connected in the brain (how much signal it needs to receive from the other neurons to activate), so we're missing the weights of the linear regressions.
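A minimal sketch of this idea, with arbitrary illustrative weights (not a real trained network): each layer is just a set of linear regressions of its inputs, passed through an activation function to break the linearity.

```python
import math

def layer(inputs, weights, biases):
    # each output neuron is a linear combination of the inputs,
    # squashed by tanh to break the linearity
    return [math.tanh(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

x = [0.5, -1.2]                                        # two input features
hidden = layer(x, [[0.3, -0.8], [1.1, 0.4]], [0.0, 0.1])  # hidden layer, 2 neurons
output = layer(hidden, [[0.7, -0.5]], [0.2])              # output layer, 1 neuron
```

Stacking more of these layers, and finding the right weights instead of making them up, is all that separates this toy from a real deep network.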

With techniques like backpropagation and stochastic gradient descent, we can optimize the weights to do useful things, like image or sound recognition and generation.
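The core of gradient descent can be sketched in a few lines, using a one-parameter squared error instead of a full network (backpropagation is just the bookkeeping that computes this gradient for every weight at once):

```python
# minimise (w - target)^2 by repeatedly stepping against the gradient
target = 3.0
w = 0.0          # initial weight
lr = 0.1         # learning rate

for _ in range(100):
    grad = 2 * (w - target)   # derivative of (w - target)^2 with respect to w
    w -= lr * grad            # step downhill

# after enough steps, w has converged to the target value
```

The "stochastic" part of SGD just means the gradient is estimated on a small random batch of data at each step, rather than on the full dataset.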

So, how I see it, the main question is:
  • Does rigor matter: how much we understand about what we do, about our models and their predictions? Or do we only care about minimizing the out-of-sample error?
This may be a free interpretation of what was being discussed at NIPS, for example at Ali Rahimi's talk, or at the interpretability debate. It was interesting to see how excited people were about the debate, and the "celebrities" on the stage:

I think something important was missing from the debate, and it's what Chris Olah and Shan Carter describe as research debt. Like in software, what you have today is not the only thing that matters; what you will have in the future matters too. The better the internal quality of your software, the easier it will be to improve it and add new features later. I think every good software engineer is aware of how important it is to keep technical debt under control. But I don't think most researchers are aware that our understanding of the research today is key for future research.

So, in my opinion, it's not that important that deep learning gives us state-of-the-art results in many areas. I don't think we'll have much better results in the future unless we focus on quality research, and not just on trying random things to get a small increase in model accuracy.


Generative Adversarial Networks

I think Generative Adversarial Networks were by far the most popular topic at NIPS. I'm not sure how many talks Ian Goodfellow gave, but I don't think it was far from one every day. And there were all sorts of applications of GANs, including many for creativity and design. We're not yet at the point of being able to generate arbitrary images in high definition, but it doesn't seem it'll take long to see even more impressive results than what we've already seen. One of the most discussed papers was the GAN that generates celebrity faces.

Bayesian statistics

Bayesian statistics was also very present during the whole of NIPS. Many times together with deep learning, as in the Bayesian deep learning and deep Bayesian learning talk, the Bayesian deep learning workshop, or the Bayesian GAN paper. Gaussian processes and Bayesian optimization were also present, from the tutorials to the workshops.

Surprisingly to me, most of the papers presented about multi-armed bandit problems were based on frequentist statistics. I say surprisingly because I think the industry is mostly adopting Bayesian methods for A/B testing, one of the main applications. In my opinion Bayesian methods are much simpler and more intuitive, and tend to offer better results. One of the hot topics in this area is lowering the false discovery rate in repeated tests. Many papers about contextual bandits were also presented, a topic I discovered at NIPS.
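The Bayesian approach to bandits can be sketched with Thompson sampling: keep a Beta posterior over each variant's conversion rate, sample from each posterior, and show the variant with the highest sample. The conversion rates below are made up for illustration.

```python
import random

random.seed(42)
true_rates = [0.05, 0.10]   # hypothetical conversion rates of variants A and B
wins = [0, 0]
losses = [0, 0]

for _ in range(5000):
    # one draw from each arm's Beta(1 + wins, 1 + losses) posterior
    samples = [random.betavariate(1 + wins[i], 1 + losses[i]) for i in range(2)]
    arm = samples.index(max(samples))        # play the arm with the best sample
    reward = random.random() < true_rates[arm]
    wins[arm] += reward
    losses[arm] += 1 - reward

pulls = [wins[i] + losses[i] for i in range(2)]
```

The nice property, compared to a fixed-horizon frequentist test, is that traffic shifts automatically towards the better variant as evidence accumulates.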

Reinforcement learning

RL was the last of the main topics that kept repeating during the whole of NIPS, if I'm not missing any. Both based on classic Q-learning, and using deep learning representations.
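Classic tabular Q-learning fits in a few lines. This is a minimal sketch on a made-up two-state, two-action problem, applying the standard update Q(s,a) += alpha * (r + gamma * max Q(s',·) - Q(s,a)):

```python
import random

random.seed(0)
n_states, n_actions = 2, 2
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def step(state, action):
    # hypothetical dynamics: action 1 in state 0 gives reward 1 and moves to state 1
    if state == 0 and action == 1:
        return 1, 1.0
    return 0, 0.0

state = 0
for _ in range(5000):
    # epsilon-greedy action selection
    if random.random() < epsilon:
        action = random.randrange(n_actions)
    else:
        action = max(range(n_actions), key=lambda a: Q[state][a])
    next_state, reward = step(state, action)
    # the Q-learning update rule
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state
```

The "deep" variants presented at NIPS replace the table Q with a neural network, but the update is recognisably the same.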

Other topics

There were a couple of other topics that I found interesting, and that were new to me:
  • Optimal transportation
  • Distribution regression
A great talk, though not because of the technical content, was "Improvised Comedy as a Turing Test", where two researchers and comedians performed improvised comedy with a robot they implemented:

About the conference

It was my first time attending an academic conference, and some things weren't very intuitive, being used to open source or business conferences. This is a random list of my thoughts:
  • I found the location quite good:
    • Near to a main airport, so I could fly directly from London
    • Good temperature
    • Many hotels nearby
    • English speaking country
    • The only problem with the location was that people from several countries (e.g. Iran) were banned from attending, as the organizers mentioned in the home page of the conference
  • I found the use of an app to communicate during the conference quite convenient. Even if the app had some obvious flaws, like the mess with the list of discussions, it added a lot of value
  • I found it difficult to know what to expect about food. I think at all the previous conferences I attended (and they are not few), breakfast and lunch were provided. At NIPS the schedule only advertised that breakfast wasn't offered first thing in the morning, with no other mention. Then, breakfast was provided later in the morning (one day the breakfast was obviously decided by an algorithm). Lunch wasn't provided, and dinner was, but in a different undisclosed location in the venue. One day dinner was provided twice (the regular one, plus a voucher for a food truck, only valid that day for dinner).
  • The sponsors were quite interesting. Not only because I managed to get up to 10 t-shirts (including one with Thomas Bayes' face), but because I had very interesting conversations with many people at the booths. The diversity of countries represented in the sponsor area was also interesting. While one could expect Silicon Valley companies to eclipse the rest, the number of Chinese and English companies was at the same level, with some other countries represented too, like Canada or Germany. One of the fun things in the sponsors section were the live cameras performing predictions or style transfer:

  • Compared to open source conferences, I found the atmosphere at NIPS very different. Maybe it's the nature of research versus open source, but my experience is that open source conferences have a very collaborative environment. You don't necessarily need to like or use someone else's project to have a friendly discussion or appreciate their contribution. But I felt research was quite a competitive environment. More than once I saw people in presentations or poster sessions addressing the presenter in a not very nice way, challenging their research, trying to point out that they know better. I think providing constructive feedback is always great, but I found it sad to get this feeling (which may be biased by the few examples I saw) that researchers see each other more as rivals than as part of a community that delivers together.


On the systems part (mainly in the workshop), it was very interesting to see the talks about the main tensor software from the big companies in Silicon Valley:
On the fun side, TensorFlow presented their eager mode, and Soumith Chintala mentioned that "PyTorch implemented the eager mode before the eager mode existed". Some time after, he mentioned that PyTorch will soon implement distributions, the way TensorFlow does. So, the main innovation of each project is copied from the competitor. :)

Tensors aside, the star of the ML Systems workshop was Jeff Dean. He discussed TPUs, and how Google is creating the infrastructure for training deep learning models. The interest in Google, deep learning and Jeff Dean was at its maximum, and the room was as crowded as a room can be. Some time before the talk, I had the honor of meeting Jeff Dean, as the picture proves:

On the more pragmatic side, it was interesting to see the poster about CatBoost, Yandex's version of gradient boosted trees. I found the ideas in the paper quite interesting; there are several novel parts compared to xgboost. I spent a bit of time testing whether the results were as good as presented, but the documentation is not yet as good as it could be, and the API is a bit confusing, so I finally gave up.

One of the most interesting insights from NIPS wasn't actually presented. It came from a discussion with Gael Varoquaux, core contributor of scikit-learn. I wanted to talk with him about scikit-learn, and see if we could help with its development as part of the London Python Sprints group. But given the current state and the nature of the project, that doesn't seem very useful at this point (see this comment for clarification). What was interesting about the conversation was discovering the new ColumnTransformer. While it's not yet merged, a pull request already exists to be able to apply sklearn transformers to a subset of columns. At the moment sklearn doesn't provide an easy way (or a way that lets you understand your models later), and I think most of us were implementing this ourselves in our own projects.
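As an illustration (the names come from the pull request, so take this as a sketch of what the API looks like rather than a stable interface): you declare which transformer applies to which columns, and fit the whole thing at once.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# a tiny made-up dataset with one numeric and one categorical column
df = pd.DataFrame({
    "age": [25, 32, 47],
    "city": ["London", "Madrid", "London"],
})

# scale the numeric column, one-hot encode the categorical one
ct = ColumnTransformer([
    ("scale", StandardScaler(), ["age"]),
    ("onehot", OneHotEncoder(), ["city"]),
])
result = ct.fit_transform(df)
# result has 1 scaled column plus one column per city category
```

The point is that the per-column mapping lives in one named object, so you can still tell later which transformer produced which feature, instead of burying it in ad-hoc glue code.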

A sad story

To conclude, I want to mention something that I didn't experience myself at NIPS, but that many of us read about later on: Kristian Lum's story about sexual harassment in research. Hopefully this whole wave of scandals is the beginning of the end, from English politicians to Hollywood... And it may not be fair, but while equally disgusting as all the other cases, I found it more surprising in research. That the brightest minds in their fields have been abusing, and being abused, is something I find more shocking than in an industry like Hollywood.

The second part of the story, this one with names, came not much later, in this Bloomberg article.

On a positive note, I think the problem is not that difficult to solve. In the Python community I think we've got all the mechanisms in place to avoid these problems as much as possible: strict codes of conduct, whistleblower channels in conferences like EuroPython, and a friendly and inclusive environment. The paradox is that the proportion of female attendees at Python conferences is much smaller than what I saw at NIPS. I'd bet a larger number of women would make these cases less likely.

I hope Kristian's example is not only useful to fix this specific case, but also makes it easier for other people to speak up, and to end this forever.

Friday, October 13, 2017

Assigning yourself to a GitHub issue

Contributing to open source is one of the most rewarding experiences one can find: finding a bug or a cool new feature for a widely used library, working on it, and sharing it with the rest of the users. This is how open source has become so great and so widely used.

The workflow just described is relatively simple at a small scale, but can become trickier when many people are working on the same project at the same time.

One idea I have in mind is to create a macro-sprint, where many Python user groups from all around the world sprint on improving the pandas documentation. The pandas documentation isn't bad, but it could easily be improved, for example by adding more examples to the DataFrame and Series methods. An example of a page that could be improved by adding examples is the Series rmul method.

To organize this, every sprinting team could get a subset of methods. For example, one of the teams could work on the Series conversion methods. This is a bit tricky, but even with a simple online spreadsheet listing the method categories, we could assign each one to a group.

Then, in a sprint with 20 people working on the same methods, we would create another spreadsheet listing each method, and every programmer could assign themselves to the method they want to work on, so nobody else works on it, which would otherwise end up in a lot of wasted time and duplicated work.

But of course, this is very tricky. In a coordinated sprint, working on something very structured like pandas methods could work. But it sounds ridiculous that each project should keep a spreadsheet with the list of issues, so every programmer can let the others know what she or he is working on.

This was a solved problem 10 years ago, when I was quite involved with the Django community. At that time, Django was using Trac to manage the tickets, and every ticket had an "Assigned to" field, where a programmer could let others know that they shouldn't work on it without talking to her or him first.

Why is this an issue today? While there are few companies that have done as much as GitHub for the open source community, I think they made a big mistake. GitHub also has the "Assigned to" field, but it can only be edited by core developers of the project.

Core developers are surely one of the bottlenecks of every open source community. Coming back to pandas, there are, at the time of writing this post, 100 open pull requests. So it doesn't seem a good idea that every time you want to work on an issue, you need to bother a core developer so she or he assigns the ticket to you.

Is this affecting the open source community? It's difficult to tell, but if we compare the number of assigned tickets in pandas and Python, we can see that pandas has 2,039 open issues, but only 30 of them are assigned (I bet all of them to core developers).

In comparison, if we check the Python bug tracker (Python uses GitHub for the code, but not for the issues), we can see that around 50% of the tickets seem to be assigned to someone.

It's difficult to tell what the effect on code contributions is, beyond ticket assignment, but it's reasonable to think that GitHub is discouraging users from contributing by not letting them assign issues to themselves.

As shown in this thread, the creator of npm requested this feature in 2013. Four years later, there are many +1's in this unofficial ticket (it's not a ticket for GitHub developers; it's the npm creator's own way of keeping track of his request to GitHub). But the feature is still missing.

Why GitHub is against, or has no interest in, a feature so obviously needed for a healthy open source community is a mystery to me. But if you feel like I do, please let GitHub support know.

Tuesday, May 9, 2017

PyData London 2017, write up

This is a post about my experience at PyData London 2017: what I liked, what I learnt... Note that with 4 tracks and so many people, my opinions are very biased. If you want to know how your experience would be: it'll be amazing, but different from mine. :)

On the organization side, I think it's been excellent. Everything worked as expected, and when I had a problem with the wifi, I got it fixed in literally a couple of minutes by the organizers. It was great to have sushi and burritos instead of last year's sandwiches too. The slack channels were quite useful and well organized. I think the organizers deserve a 10, and that's very challenging when organizing a conference.

On the content side, I used to attend conferences mainly for the talks. But this year I decided to try the other things a conference can offer (networking, sprints, unconference sessions...). Some random notes:

Bayesian stuff

I think probabilistic models are the area of data science with the highest entry barrier. This is a personal opinion, but one shared by many others, including authors:

The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chapters of slow, mathematical analysis. The typical text on Bayesian inference involves two to three chapters on probability theory, then enters what Bayesian inference is. Unfortunately, due to mathematical intractability of most Bayesian models, the reader is only shown simple, artificial examples. This can leave the user with a so-what feeling about Bayesian inference. In fact, this was the author's own prior opinion.

It looks like there is even terminology to distinguish whether the approach used is mathematical (formulae and proofs, quite cryptic to me) or computational (more focused on the implementation).

It was a luxury to have at PyData once more Vincent Warmerdam, from the PyData Amsterdam organization. He has been one step ahead of most of us, who are more focused on machine learning (I haven't met any frequentist so far at PyData conferences). He already gave a talk last year on the topic, The Duct Tape of Heroes: Bayes Rule, which was quite inspiring and made probabilistic models easier, and this year we got another amazing talk, SaaaS: Sampling as an Algorithm Service.

After that, we managed to have an unconference session with him, where we could see the examples presented in the talk in more detail. While Markov chain Monte Carlo and Gibbs sampling aren't straightforward to learn, I think we all learnt a lot, so we can finish learning all the details easily by ourselves.
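To give a flavour of what sampling is about, here is a minimal sketch of the Metropolis algorithm (the simplest MCMC sampler, not the exact examples from the session): it draws samples from a standard normal target without ever normalising the density, by accepting or rejecting random-walk proposals.

```python
import math
import random

random.seed(0)

def unnormalised_target(x):
    # proportional to the N(0, 1) density; the normalising constant is never needed
    return math.exp(-x * x / 2)

x = 0.0
samples = []
for _ in range(20000):
    proposal = x + random.gauss(0, 1)   # random-walk proposal
    accept_prob = min(1.0, unnormalised_target(proposal) / unnormalised_target(x))
    if random.random() < accept_prob:   # accept with the Metropolis ratio
        x = proposal
    samples.append(x)                   # a rejected proposal repeats the current value

mean = sum(samples) / len(samples)
```

The only thing the sampler ever needs is the ratio of unnormalised densities, which is why MCMC works for the intractable posteriors that make analytical Bayesian inference so hard.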

There were other sessions about Bayesian stuff too:

And probably some others that I'm missing, so it looks like the interest in the area is growing, and PyMC3 looks to be the preferred option for most people.

I've got good recommendations of books related to probabilistic models and Bayesian stuff, which shouldn't use the tough approach:

There is a Meetup in London, which is the place to be to meet other Bayesians:

Frequentist stuff

<This space is for sale, contact the administrator of the page>

Topic modeling and Gensim

Another topic that looks to be trending is topic modelling, using vector spaces for NLP, and Gensim in particular. That includes Latent Dirichlet allocation, one of the most amazing algorithms I've seen in action.

We also got a Gensim sprint during the conference, where we could not only learn about what Gensim does, but also why it is a great open source project. In the past I could see how Gensim was able to return the most similar documents immediately, in a dataset with more than one million samples. While the documentation gives many hints on how Gensim was designed with performance in mind, it was a pleasure to participate in a Gensim sprint, and to see the code, and the people who make this happen, in action.
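The vector-space idea behind that similarity query can be sketched in plain Python (Gensim does the same thing with streaming corpora, TF-IDF weighting and heavily optimised code; these three toy documents are just for illustration):

```python
import math

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "gensim builds vector spaces"]

# vocabulary over all documents, one dimension per word
vocab = sorted({w for d in docs for w in d.split()})

def bow(text):
    # bag-of-words vector: count of each vocabulary word in the text
    words = text.split()
    return [words.count(w) for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

query = bow("cat on a mat")
ranked = sorted(range(len(docs)),
                key=lambda i: cosine(query, bow(docs[i])),
                reverse=True)
```

With a million documents, the trick is doing exactly this ranking without recomputing every vector per query, which is what Gensim's similarity indexes are for.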

Amazing also to see how Lev Konstantinovskiy managed to run a tutorial, a talk, a sprint and a lightning talk, during the conference.

From theory to practice

It may be just my impression, but I'd say there have been more talks on applications of data science, and more diverse ones. While I remember talks on common applications like recommender systems in previous editions, I think there's been an increase in talks on applications of all these techniques in different areas.

To name a few:
Also, the astronomy/aerospace communities look to be quite active inside the PyData community.

Data activism

Another area which I'd say is growing is data activism, or how to use data in a social or political way. We got a keynote on fact checking, and another about analyzing data for good, to prevent money laundering using government information.

DataKind UK looks to be the place to be to participate in these efforts.

Pub Quiz

That awkward moment when you thought you knew Python, but James Powell is your interviewer...

Ok, it wasn't an interview, it was a pub quiz, but the feeling was somehow similar. After 10 years working in Python, having passed challenging technical interviews for companies such as Bank of America or Google, at some point you start to think you know what you're doing.

Then, when you're relaxed in a pub after an amazing but exhausting day, James Powell starts running the pub quiz, and you feel that you don't know anything about Python. Some new Python 3 syntax, all-time namespace tricks, and so many atypical cases...

Luckily, all the dots started to connect, and I realized that a few hours before, I had been discussing with Steve Holden the new edition of his book Python in a Nutshell. It sounded like an introduction to me, but it looks like it covers all the Python internals.

Going back to the pub quiz, I think it's one of the most memorable moments in a conference. Great people, loads of laughs, and an amazing set of questions perfectly executed.

Big Data becoming smaller

As I mentioned before, my experience of the conference is very biased, and very influenced by the talks I attended and the people I met... But my impression is that the boom in big data (large deep networks, Spark...) is not a boom anymore.

Of course there are a lot of people working with Spark and researching deep neural networks, but instead of growing, I felt like these things are losing momentum, and people are focusing on other technologies and topics.

Meetup groups

One of the things I was interested in was finding new interesting meetups. I think these are among the most popular ones in data science:

But I met many organizers of other very interesting meetups at the conference:

To conclude, there are a couple of tools/packages I discovered that it seemed everybody else was already aware of.

It looks like at some point, instant messaging for most free software projects moved from IRC to Gitter. There you can find data science communities, like pandas and scikit-learn, as well as non data science ones, like Django.

A package that many people seem to be using is tqdm. You can use it over an iterator (like enumerate), and it shows a progress bar while the iteration is running. Funny that, besides being an abbreviation of progress in Arabic, it's an abbreviation for "I want/love you too much" in Spanish.
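Usage is a one-line change; a minimal sketch (the sleep stands in for real work):

```python
import time

from tqdm import tqdm

results = []
# wrapping any iterable in tqdm() prints a live progress bar to stderr
for i in tqdm(range(3)):
    time.sleep(0.01)   # stand-in for real per-item work
    results.append(i * i)
```

It also works wrapped around enumerate or a generator, as long as you pass `total=` when the length can't be inferred.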

What's next?

Good news. If you couldn't attend PyData London 2017, or you didn't have enough of it, there are some things you can do:
  • Attend PyData Barcelona 2017, which will be as amazing as PyData London, also in English, and with top speakers like Travis Oliphant (author of SciPy and NumPy) or Francesc Alted (author of PyTables, Blosc, bcolz and numexpr).
  • Wait until the videos are published in the PyData channel (or watch the ones from other PyData conferences)
  • Join one of the 55 PyData meetups around the world, or start yours (check this document to see how, NumFOCUS will support you).
  • Join one of the other conferences happening later this year in Paris, Berlin, EuroPython in Italy, Warsaw... You can find them all at

Thursday, May 12, 2016

PyData write-up

This last weekend I went to my third PyData, the one in London, and it's been such a great experience.

Before, I went to PyData Amsterdam, and PyData Madrid, also this year.

After the three conferences, which were very similar but quite different at the same time, I just wanted to share what I liked, and what in my opinion could be improved. I hope future organizers can find some useful information in my ideas and thoughts. And that includes my future self, for when I'm an organizer.


I wasn't involved that much in the organization, but my belief is that more should be delegated. I couldn't see it that much in Amsterdam, but the organizers of both Madrid and London looked extremely exhausted at the end of the conference. Maybe I'm too optimistic, but I'd say that more people would like to help. A good idea is probably to find volunteers for specific tasks. For example, some people would probably be happy to help with registration, and the organizers would have more time for other things, and to rest.

Event hosts

I think all three conferences had amazing hosts (the people who gave the welcome speeches, closing notes...): Vincent and the Italian guy (sorry for not remembering your name if you read this) in Amsterdam, Guillem in Madrid, and Ian and Emlyn in London. I think having hosts with great humour and communication skills makes a difference for the whole conference.

I think communication is quite important during the conference. In Madrid it was great (and somehow easy), because it was a single track, so the organizers could provide any information between talks to all attendees (where the beers would be, reminders to sign up for lightning talks...). In Amsterdam, with 2 tracks, they managed it very well.

In London, I think the communication could have been better. With 4 tracks it gets much more challenging, but I think just a bit more communication was needed, like reminders about the lightning talks, or about the tweeted photos contest...

I personally didn't like Slack that much (it was my first time using it). The mobile version (the web, not the app) is not very intuitive, and I had problems finding the channels. I prefer Twitter, to be honest.


I met really great people at all three conferences. I don't think other industries have as great a community as PyData (and Python in general) does. I didn't see anyone trying to sell their product; it was more about sharing, and getting to know what others do. I really like that.

I'm not sure if it's just my perception, but I think in London the breaks (breakfast, lunch...) were much shorter. I think London was the conference with the highest number of proposals among the three, so they tried to accommodate the maximum number of talks, but I personally would prefer to have more time for networking, even if that means a few fewer talks.


Good keynotes in general. Of course not every PyData is lucky enough to have a keynote from Travis Oliphant or Wes McKinney, but the level was quite good.

There were just a couple of things I couldn't understand (nor could the people I talked to about them):

  • In Madrid, Jaime's talk (he is a numpy core developer) should have been a keynote. Even if there were already two high-level keynotes, by Christine and Francesc, I think people need to know that a talk from Jaime (an amazing one, btw) is not the same as the one I gave.
  • In London, the opposite: I couldn't see why Tetiana's talk was a keynote. I won't say the talk was bad, it was all right, but not at the level of Travis or Andreas for sure, and IMO it should have been a normal talk, with other talks scheduled at the same time.

Very good level. Of course some talks are better than others, but in general I was quite happy with most of them.

As they are (or will be) on YouTube, here are the ones I liked most:

Lightning talks

To me, lightning talks are probably the best part of a conference. I really liked that in Madrid they had lightning talks on both Saturday and Sunday.

And for me, it was a mistake to have the lightning talks on Sunday in both Amsterdam and London. First, because people from abroad usually have to miss the end of the conference. And also because it's great for networking to see all the lightning talks on Saturday, and be able to talk to the speakers on Sunday if you share the same interests.

So, IMO, at the end of both days is best, or on Saturday if only on one of the days.

Unconference presentations

This is very biased by my personal experience, but I think the unconference presentation format was a failure. From what I could see, it worked well for the workshop Vincent gave, because he was one of the speakers and could tell a large audience about his workshop. But for the rest, I don't think the majority of the attendees knew what was in that slot.

Only 4 people attended my talk about machine learning for digital advertising. I want to believe that if the title of the presentation had been on the schedule, many more people would have attended. So, in my opinion, if unconference presentations are present in future conferences, the online schedule should be updated, and a (big) board with what is going on in that track should be present.


Comparing the three conferences, I think the food was much better in Amsterdam than in Madrid or London. In Madrid they had special meals for people who requested them (vegetarian, allergies...); I don't know about the other conferences. It's difficult to say whether it's better to spend more money on better food: of course people like better food, but they also like cheaper tickets and higher contributions to free software projects.

What I could see is that more people decided to go to restaurants in Madrid and London than in Amsterdam. Ok, in Amsterdam there weren't any restaurants around, but I think better food is better for networking. The best is probably to find a good sponsor that pays for nice food, but that looks tricky. So, I think all options are all right.


The whole experience of PyData 2016 has been amazing. Exhausting (especially the ones I had to take flights to), but amazing, and really worth it.

The organizers have done an amazing job, as have the local communities and, from what I could see and hear, the people from NumFOCUS.

Now I have a beautiful laptop full of stickers, and several PyData T-shirts.

There are a few minor things that in my opinion could be improved, to make the conference even better:

  • More time for networking
  • More communication from the organizers (telling everyone all the time what is going on: sign-ups for lightning talks, unconferences, problems with the wifi, beers planned, community announcements, and even the smaller things)
  • More lightning talks
  • Labelling as keynotes the talks that really make a difference
Thank you very much to all the people who made them possible, and see you there again next year!

Wednesday, December 23, 2015

After Fedora installation tasks

What do I do after installing Fedora 23 MATE-Compiz?

  • Install Google Chrome
  • Merge both panels to the bottom, and auto-hide it
  • Change mouse setup to allow touchpad click and double finger scroll
  • Change look and feel setup to select window when the mouse moves over it
  • Disable screensaver
  • Change terminal shortcuts
  • sudo dnf update
  • sudo dnf groupinstall "Development Tools"
  • sudo rpm -ivh
  • sudo dnf install vim-enhanced git vlc gimp inkscape unzip
  • Install Anaconda
  • Copy my settings files: .vimrc .gitconfig .ssh
  • Add aliases to .bashrc:
    • alias vi="vim"
    • alias rgrep="grep -R"
  • In Power Management, set up the computer to blank screen when laptop lid is closed

Tuesday, December 22, 2015

Jupyter environment setup

This is a short note about how I set up my "data scientist" environment. Different people have different tastes, but what I use and set up is:

  • conda for environment and package management (equivalent to virtualenv and pip, so to speak)
  • Latest Python (yes, Python 3)
  • Jupyter (aka IPython notebook)
  • Disable all the autocomplete quotes and brackets stuff, that comes by default with Jupyter
  • Set the IPython backend for matplotlib
So, we download Anaconda from: (Linux 64 bits, Python 3, in my case). We install it by running the downloaded script:

bash Anaconda3-*-Linux-x86_64.sh
We can either restart the terminal, or type the next command, so we start using conda environment:

. ~/.bashrc

We can update conda and all packages:

conda update conda && conda update --all

Then we create a new conda environment (this way we can change package versions without affecting the main conda packages). We name it myenv and specify the packages we want (numpy, pandas...).

conda create --name myenv jupyter numpy scipy pandas matplotlib scikit-learn bokeh

We activate the new environment:

source activate myenv

Now we have everything we wanted installed, let's change the configuration.

We start by creating a default ipython profile.

ipython profile create

Then we edit the configuration file in ~/.ipython/profile_default/ and add the following lines to make matplotlib display the images with the inline backend, and at a decent size:

c.InteractiveShellApp.matplotlib = 'inline'
c.InlineBackend.rc = {'font.size': 10,
                      'figure.figsize': (18., 9.),
                      'figure.facecolor': 'white',
                      'savefig.dpi': 72,
                      'figure.subplot.bottom': 0.125,
                      'figure.edgecolor': 'white'}

To disable autoclosing brackets, run in a notebook:

from notebook.services.config import ConfigManager
c = ConfigManager()
c.update('notebook', {"CodeCell": {"cm_config": {"autoCloseBrackets": False}}})

Monday, January 19, 2015

Google Earth on Fedora

Installing Google Earth in Fedora is trickier than it should be. Here is a short HOWTO:

  • Download 64bits Fedora version from Google Earth site
  • sudo yum install google-earth-stable_current_x86_64.rpm
  • OOOPS!!! You get: file /usr/bin from install of google-earth-stable- conflicts with file from package filesystem-3.2-28.fc21.x86_64

The rpm has an error, so we need to fix it. We'll rebuild the rpm, fixing the error, with rpmrebuild:

  • sudo yum install rpmrebuild
  • rpmrebuild -ep google-earth-stable_current_x86_64.rpm
  • A text editor with the spec file (rpm configuration file) is opened, you need to delete the line %dir %attr(0755, root, root) "/usr/bin"
  • rpmrebuild will ask for confirmation and inform about the path of the generated rpm, just install it
  • sudo yum localinstall ~/rpmbuild/RPMS/x86_64/google-earth-stable-

Now the application is successfully installed, but it sometimes crashes when started. It looks like the best fix is to install the 32-bit version, or Google Earth 6 (the latest is 7 at the time of writing this post). Unless you need a specific feature from version 7, I recommend installing version 6 rather than the 32-bit version of 7. The latter requires many dependencies, and it's still buggy on Fedora.

More info: