GP update 2015 05/11

(email to my fellow researchers)

Been coding 4-8 hrs every day for two weeks (since the GPU workshop). I feel I am making progress at a snail's pace, but at the same time learning a lot, step by step, piece by piece.

In the past 72 hrs I have migrated all functions into a Class (library), the foundation of proper Object Oriented Programming. Thuso stopped by yesterday and got me over a hurdle. Robert gave me some pointers as well. Mostly motivation. All coding remains at the tips of my own fingers.

My code is clean, modular, and scalable. All functions are now ‘gp.[method]’ enabled. Once I get the internal message passing working properly, building a GP will be as simple as passing a few variables from one method to the next. In essence, I am creating a GP platform.

This may seem a bit overboard, but it sets the foundation for much simpler movement into the next steps: mutation, cross-over, and reproduction. I have merged what were two scripts into a single base which enables both Classify and Symbolic regression functionality, auto-selecting the associated dataset from the files/ directory.

The Tournament selection is done and tested. It randomly picks n Trees from the total population, selects the one with the highest fitness, and stores that in a list. The tournament runs once for reproduction and mutation, and twice for cross-over.
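For readers who want to see the idea in code, here is a minimal sketch of tournament selection as described above. It is not my actual method: the names `tournament_select` and `select_parents`, the `operator` strings, and the dictionary of scores are all made up for illustration.

```python
import random


def tournament_select(population, fitness, tournament_size, rng=random):
    """Pick tournament_size trees at random and return the fittest one."""
    contenders = rng.sample(population, tournament_size)
    return max(contenders, key=fitness)


def select_parents(population, fitness, tournament_size, operator, rng=random):
    """Run one tournament for reproduction or mutation, two for cross-over."""
    rounds = 2 if operator == "crossover" else 1
    return [
        tournament_select(population, fitness, tournament_size, rng)
        for _ in range(rounds)
    ]


if __name__ == "__main__":
    # Hypothetical population: tree IDs mapped to fitness scores.
    scores = {1: 0.2, 2: 0.9, 3: 0.5, 4: 0.7, 5: 0.1}
    trees = list(scores)
    print(select_parents(trees, scores.get, 3, "mutation"))   # one winner
    print(select_parents(trees, scores.get, 3, "crossover"))  # two winners
```

A larger tournament size makes it more likely that the very best trees win every time; a smaller one leaves more room for weaker trees to slip through.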

Mutation is drafted. Need to call the methods in the right order.

Back to work …

kai

May 11th, 2015 | Ramblings of a Researcher

GP update 2015 04/14

(email to fellow researchers)

The GP Evaluation section of my code is complete.

I don’t want to admit how simple this was, but after a few sleepless nights I woke this morning with a solution to synchronise the variables created by the SymPy eval function with the columns in the original data.csv.

My code now draws the variables from the first row of data.csv (instead of a separate file) and auto-walks through every row for each tree, comparing the output of the randomly generated polynomial against the desired solution.

For my very simple test, I built a .csv whose solution for every row is the sum of all the numbers, as follows:

   a,b,c,s
   1,2,3,6
   4,5,6,15
   7,8,9,24

And by random luck, my third run of a single tree came up with the solution. However, it has not happened again since :)
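The row-by-row comparison can be sketched roughly as follows. This is an illustration under assumptions, not my actual code: the function name `count_matches` is hypothetical, the data is inlined rather than read from data.csv, the last column is assumed to be named `s`, and all values are assumed to be integers.

```python
import csv
import io

import sympy

# The toy data set from this post, inlined so the sketch is self-contained.
DATA = """a,b,c,s
1,2,3,6
4,5,6,15
7,8,9,24
"""


def count_matches(expression, csv_text, solution_column="s"):
    """Count the rows for which the expression equals the desired solution."""
    expr = sympy.sympify(expression)
    matches = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The header names become the symbols, which keeps the variables
        # in the expression in sync with the columns of the data.
        values = {
            sympy.Symbol(name): int(value)
            for name, value in row.items()
            if name != solution_column
        }
        if expr.subs(values) == int(row[solution_column]):
            matches += 1
    return matches


if __name__ == "__main__":
    print(count_matches("a + b + c", DATA))  # 3: correct on every row
    print(count_matches("a * b * c", DATA))  # 1: correct on the first row only
```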

I can, as of now, draw from *any* .csv file, including the SKA data set. While it would not produce a valuable answer, it feels good to be real-world capable.

Now I am adding a flag for each tree that succeeds. Next, I will add support for Boolean operators and then build the Tournament.

Exciting!

kai

April 14th, 2015 | Ramblings of a Researcher

Learning recursion

Yesterday presented a real mental struggle.

I entered my office at AIMS with my good friend Adriaan, who had spent the night at my flat. I walked him through my work in Genetic Programming, sharing the challenges and successes to date. The next step was to flatten the GP tree into a live polynomial in order to push real-world data through and learn how each tree performs.

I had devised a bottom-up approach, analysing the GP tree structure using the 2D array which holds each node and all of its associated values. A series of nested for-next loops would build the formula, starting at the bottom and working to the top. A bit mechanical, but something I knew how to do.

Adriaan suggested a top-down recursive method. I understood the concept of recursion, but had never programmed one properly. He drew an example on the black board and I was lost. He drew another, and I remained lost. I need physical examples for my brain to grasp a concept, and recursion is fairly ambiguous by nature. I grew frustrated. And Adriaan had to leave for Town.

I worked on two other updates to my code. Now my operands and coefficients reside in external .csv files which are imported at run-time.

Arun arrived an hour later and suggested I write a basic recursion script to calculate factorials. Of course. That made sense. And it worked!
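For anyone else who needs a physical example, a recursive factorial looks something like this. It is a generic sketch, not the script I wrote that day.

```python
def factorial(n):
    """Return n! by calling itself on a smaller number each time."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        # Base case: this is where the recursion stops.
        return 1
    # Recursive case: n! = n * (n - 1)!
    return n * factorial(n - 1)


if __name__ == "__main__":
    print(factorial(5))  # 120
```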

I then returned to my script and in about two hours more had it working. The challenge was fine-tuning the code to present 3 different levels of recursion depending on the ‘arity’ (number of child nodes) for each parent node. In the end, the number of lines of code was similar to my original approach, but recursion is more elegant … and I learned something new.
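The shape of that top-down recursion can be sketched as below. Note the assumptions: my code reads the tree from a 2D array, whereas this sketch uses nested tuples of the form `(label, child, ...)` to stay short, and the function name `flatten` and the example operators are hypothetical.

```python
def flatten(node):
    """Turn a nested (label, child, ...) tuple into an expression string."""
    if not isinstance(node, tuple):
        return str(node)  # a terminal has no children, so the recursion stops

    label, *children = node
    parts = [flatten(child) for child in children]  # recurse into each child

    # One branch per arity (number of child nodes).
    if len(parts) == 1:
        return f"{label}({parts[0]})"
    if len(parts) == 2:
        return f"({parts[0]} {label} {parts[1]})"
    if len(parts) == 3:
        return f"{label}({parts[0]}, {parts[1]}, {parts[2]})"
    raise ValueError(f"unsupported arity {len(parts)} for {label!r}")


if __name__ == "__main__":
    tree = ("+", ("*", "a", "b"), ("sqrt", "c"))
    print(flatten(tree))                   # ((a * b) + sqrt(c))
    print(flatten(("if", "a", "b", "c")))  # if(a, b, c)
```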

Thank you Adriaan and Arun!

Now, my GP code generates random mathematical polynomials which will soon be tested in a tournament to determine which ones will move into four types of mutation and reproduction to build the subsequent generation.

Progress!

April 7th, 2015 | Ramblings of a Researcher

My first GP polynomials!

[Image: First GP Tree by Kai Staats]

Having wanted to replace the grey matter in my head with something more valuable, e.g. whipped cream, I can report that the recursive loop now generates polynomial strings!

In order to convert the resulting string into an executable polynomial, Arun suggested the library SymPy. SymPy even evaluates the expression, producing a simplified version and/or returning ‘0’ if it is not functional. If this works, I will not need to write an evaluator.
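A small sketch of what that looks like with SymPy's `sympify` and `expand`. The example strings are made up, and the helper name `to_expression` is hypothetical.

```python
import sympy


def to_expression(polynomial):
    """Parse a polynomial string into a SymPy expression."""
    return sympy.sympify(polynomial)


if __name__ == "__main__":
    # Like terms are combined as soon as the string is parsed.
    print(to_expression("a + b - b + 2*c"))  # the b terms cancel
    print(to_expression("a*b - a*b"))        # 0

    # Some cancellations only show up after expanding the expression.
    hidden_zero = to_expression("(a + b)**2 - a**2 - 2*a*b - b**2")
    print(sympy.expand(hidden_zero))         # 0
```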

New to this version, the code now calls external .csv files for available functions and terminals. At the very bottom of the run, it auto-generates the polynomial.
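Loading the functions and terminals from .csv might look roughly like this. The file layout is an assumption for the sake of the example (a `label` column, plus an `arity` column for functions), and the text is inlined here instead of being read from disk.

```python
import csv
import io

# Hypothetical file contents; the real files may be laid out differently.
FUNCTIONS_CSV = """label,arity
+,2
-,2
*,2
"""

TERMINALS_CSV = """label
a
b
c
"""


def load_functions(csv_text):
    """Return a dict mapping each function label to its arity."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["label"]: int(row["arity"]) for row in reader}


def load_terminals(csv_text):
    """Return the list of terminal labels."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["label"] for row in reader]


if __name__ == "__main__":
    print(load_functions(FUNCTIONS_CSV))
    print(load_terminals(TERMINALS_CSV))
```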

Fun!

April 6th, 2015 | Ramblings of a Researcher

My first GP trees!

[Image: First GP Tree by Kai Staats]

After 10 days of coding, I have completed a GP tree generator!

Tested are Full and Grow trees through depth 5. Both parents and children are properly recorded. I can run ‘trees’ from the command line to view the NumPy array which holds the tree.
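The difference between the two methods can be sketched as follows, using the usual textbook formulation rather than my own generator. The assumptions: trees are nested tuples instead of a NumPy array, the function and terminal sets are made up, and depth is counted from 0 at a bare terminal.

```python
import random

FUNCTIONS = {"+": 2, "-": 2, "*": 2}  # hypothetical label -> arity
TERMINALS = ["a", "b", "c"]           # hypothetical features


def build_tree(depth, method, rng=random):
    """Build a nested-tuple tree with the Full or Grow method."""
    # Chance that Grow stops early and places a terminal at this node.
    p_terminal = len(TERMINALS) / (len(TERMINALS) + len(FUNCTIONS))
    if depth == 0 or (method == "grow" and rng.random() < p_terminal):
        return rng.choice(TERMINALS)
    # Full always lands here until the depth limit, so every branch
    # reaches the maximum depth.
    label = rng.choice(list(FUNCTIONS))
    children = (build_tree(depth - 1, method, rng) for _ in range(FUNCTIONS[label]))
    return (label, *children)


def depth_of(tree):
    """Return the depth of a tree; a bare terminal has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth_of(child) for child in tree[1:])


if __name__ == "__main__":
    print(depth_of(build_tree(5, "full")))  # always 5
    print(depth_of(build_tree(5, "grow")))  # anywhere from 0 to 5
```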

The Python code is coming along nicely. Clean, commented, and modular such that I will be able to extract all internal functions as external methods. A few more changes, such as making the section that builds the root a function, but it’s getting there.

It will be relatively simple to draw upon external data sets for the FUNCTIONS (operands) and TERMINALS (features) as the entire code base is designed to scale.

Yeah!

March 30th, 2015 | Ramblings of a Researcher

Zen & the Art of Research

Our professor Bruce took us on a 5-day Zen meditation retreat. Yes, a meditation research retreat. How cool / weird / awesome is that?!

We spent 8 hours each day not talking, and then talked about not talking over dinner. Wasn’t all that different from normal research, in my experience. The venue was stunning. A gorgeous, isolated guest farm about two hours South and East of Cape Town.

Thank you Nadeem, Arun, Gilad, Eli, and Martin for a great week … of not talking.

On departure we learned the next group to come through the guest farm is an orgasm retreat. I think I signed up for the wrong week.

March 6th, 2015 | Ramblings of a Researcher

Concretely Andrew Ng

Today I completed the Andrew Ng open course on Machine Learning.

Every morning for the past two months I have awaken (woke? waked? woke up?) at 6 am, on the beach by 6:30, then run, surfed, practised yoga or a combination for an hour. Back to my flat for breakfast and 1-2 Andrew Ng videos until 10 am. Down to AIMS for tea and into the office (where I am distracted by the view of the waves and beach).

Had to watch some of the lectures more than once, to absorb all that was presented. I paused at every formula in order to copy it into my small, spiral notebook. Over 50 pages in all. The first two chapters were hard to get through, but then I gained a kind of momentum–I even looked forward to the videos.

If you desire a crash course in Machine Learning, this is the way to go.

However, I hope to never hear the word “concretely” again.

March 1st, 2015 | Ramblings of a Researcher

From Java to Python

Today I engaged Emmanuel in a Skype call to review my first Python translation of his Java code. I then sketched a workflow diagram (in gedit), which I delivered to Emmanuel for his review. Feels good to have made progress, even if just a few lines of code.

Given that I don’t know Java, this is going to be an arduous process.

February 28th, 2015 | Ramblings of a Researcher

Genetic Programming 101

Getting Started with GP by Emmanuel Dufourq

Epilogue
“Well, this is where it all started. A few lines of Java loosely translated to Python, the first three chapters of the ‘Field Guide to Genetic Programming’, and guidance from fellow researcher Emmanuel and officemate Arun, when I took wrong turns.

Had I known the effort would be not just six weeks, but six months, resulting in more than 2300 lines of Object Oriented code producing an extensible, multi-core platform for both symbolic regression and classification, with a user interface, well, I would have either been pleasantly surprised or run away screaming mad.

Either way, I look back and recognise how far I have come as a programmer, how much I have gained in training as a researcher, and how good it feels to have dedicated myself to a substantial task and followed through.” –kai, 26 September 2015

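public Node createTree(int maxDepth, String type){

    // Choose one of the four Boolean functions at random for the root node.
    int random = gen.nextInt(4);
    Node root;

    if(random == 0){
        root = new And();
    }
    else if(random == 1){
        root = new Or();
    }
    else if(random == 2){
        root = new If();
    }
    else{
        root = new Not();
    }

    // The tree starts with the root as its only node.
    treeSize = 1;

    // Build the rest of the tree beneath the root.
    populateTree(root, root.getLabel(), type, 1, maxDepth);

    return root;
}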

To read only essays and entries about my work in genetic programming and machine learning, select the category Ramblings of a Researcher.

February 17th, 2015 | Ramblings of a Researcher