FP Complete - Latest Comments
http://fpcomplete.disqus.com/en
Sat, 12 Apr 2014 12:57:47 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1333507364
<p>Just some non-conclusive numbers. For comparison, the Python version with reduce() and generators ran in 4.8s.<br>While I was trying to install conduit and fighting cabal hell (despite sandboxes :( ), I wrote a C version as a benchmark: it ran in 0.16s. While still waiting, I compiled your first Python script with Cython, annotating the numeric types. The shared object ran in less than 0.8s, with the slowdown relative to C mostly due to object parsing in function invocation.</p>
Christian Andreetta, Sat, 12 Apr 2014 12:57:47 -0000

Re: The Downfall of Imperative Programming
http://fpcomplete.com/the-downfall-of-imperative-programming/#comment-1332213173
<p>This text is very informative and inspiring. Thanks.</p>
Fernando Basso, Fri, 11 Apr 2014 14:50:37 -0000

Re: Calculating the Minimum Variance Portfolio in R, Pandas and IAP
https://www.fpcomplete.com/blog/2014/04/mvp#comment-1323630159
<p>This Python code might be better for a performance comparison, since the full matrix inversion isn't necessary:</p><p># added a line to get test data<br>prices = np.array([np.random.randn(2000).cumsum() + 100 for i in range(1000)])</p><p>def minvar2(prices):<br>    cov = np.cov((prices[1:] / prices[:-1] - 1).transpose())<br>    vu = np.ones(cov.shape[0])<br>    num = np.linalg.solve(cov, vu)<br>    den = np.dot(vu, num)<br>    return num / den</p>
Guest, Tue, 08 Apr 2014 06:01:30 -0000

Re: Calculating the Minimum Variance Portfolio in R, Pandas and IAP
https://www.fpcomplete.com/blog/2014/04/mvp#comment-1323609147
<p>It would be good to see performance figures for all 3 languages (with and without compilation in Haskell). I would have assumed all the solutions would take a similar time: the time it takes the matrix library to compute the covariance matrix.
Saying that compilation reduced the time by a factor of 100 makes me wonder what the story is.</p><p>I'm glad to see a demonstration of atomic matrix operations. A suggestion for adding some real-world complexity would be to forward-fill the NaNs in the price data, winsorise the returns at X basis points, and take a moving average over Y trading days. This might also be an opportunity to show how to manage global settings conveniently: in Python, for instance, one might set these constants at the top of the program.</p><p>This Python code might be better for a performance comparison, since the full matrix inversion isn't necessary:</p><p># added a line to get test data<br>prices = np.array([np.random.randn(2000).cumsum() + 100 for i in range(1000)]).T</p><p>def minvar2(prices):<br>    cov = np.cov((prices[1:] / prices[:-1] - 1).T)<br>    vu = np.ones(cov.shape[0])<br>    num = np.linalg.solve(cov, vu)<br>    den = np.dot(vu, num)<br>    return num / den</p>
HM, Tue, 08 Apr 2014 05:33:11 -0000

Re: Calculating the Minimum Variance Portfolio in R, Pandas and IAP
https://www.fpcomplete.com/blog/2014/04/mvp#comment-1323593992
<p>I think your Haskell version is incomplete, check line 2. And there is the numpy.eye() constructor for unit vectors.</p>
Maxim Yegorushkin, Tue, 08 Apr 2014 05:11:24 -0000

Re: Emacs and our API | FP Complete
https://www.fpcomplete.com/blog/2013/12/api-emacs#comment-1321920261
<p>Looks very cool. I have a couple of questions. (a) My employer won't allow sensitive code to be edited on outside servers, no matter how secure they are. Is there a way to install everything locally? (b) I'm comfortable with elisp, but I might want to start scripting some Emacs behaviors using Haskell. Do you have a good elisp<->Haskell binding?
Thanks.</p>
John Matthews, Sun, 06 Apr 2014 22:35:35 -0000

Re: Calculating the Minimum Variance Portfolio in R, Pandas and IAP
https://www.fpcomplete.com/blog/2014/04/mvp#comment-1319057088
<p>Pandas, though prominent in the heading, is not mentioned in the text, or did I miss something?</p>
Andreas Reuleaux, Fri, 04 Apr 2014 17:23:58 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1315306063
<p>Not going into a language war (wait for "but"), since I really like both languages, BUT (there you go): in the real world nobody would do numerical programming using Python's own list comprehensions. What if you want to parallelise your code? There are a lot of other (efficient) ways to do it; most of them are generally ugly and require C/Fortran knowledge, but they work pretty well and they're fairly easy to do.</p><p>Just to boast a bit, I would suggest another one-liner, which uses numpy.random.uniform and Python's generators. If you want to make it memory efficient you can replace numpy.random.uniform with a generator, but it obviously gets slower.</p><p>4*sum(1. if x*x + y*y < 1 else 0 for (x,y) in zip(numpy.random.uniform(0,1,100000),numpy.random.uniform(0,1,100000)))/100000</p>
Marco Conuncognomebuffo, Wed, 02 Apr 2014 14:11:26 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1314746986
<p>And since Monte Carlo is "embarrassingly parallel", I did just that and got a 2-3x speedup.</p>
idontgetoutmuch, Wed, 02 Apr 2014 08:31:42 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1314724720
<p>FWIW I just tried comparing Python (using the generator reduce snippet below) and Haskell, and got:</p><p>~/Dropbox/Private/HasBayes $ ghc -O2 Pi.hs<br>[1 of 1] Compiling Main ( Pi.hs, Pi.o )<br>Linking Pi ...<br>~/Dropbox/Private/HasBayes $ time ./Pi<br>3.141126<br>real 0m3.510s<br>user 0m3.458s<br>sys 0m0.051s</p><p>~/Dropbox/Private/HasBayes $ time python <a href="http://Pi.py" rel="nofollow">Pi.py</a><br>3.14215381506<br>real 0m6.151s<br>user 0m6.081s<br>sys 0m0.021s</p>
idontgetoutmuch, Wed, 02 Apr 2014 08:11:53 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1314692286
<p>I never meant to imply that conduit is the only way to solve this problem in Haskell. I used it here because (1) it provides for a compact, declarative solution, (2) it naturally bypasses the memory leak in the original Python implementation, and (3) it's what we're going to be using in the IAP release.</p>
Michael Snoyman, Wed, 02 Apr 2014 07:30:46 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1314665070
<p>I am not sure that you have to use conduit.
I just wrote a short blog post on sampling from Student's t in Haskell using random-fu: <a href="http://idontgetoutmuch.wordpress.com/2014/04/02/students-t-and-space-leaks/" rel="nofollow">http://idontgetoutmuch.wordpre...</a></p>
idontgetoutmuch, Wed, 02 Apr 2014 06:52:23 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1313247826
<p>This is an area where we can add a lot of value. The IAP already has tools and an EDSL for simplifying the tasks of gleaning the input model from the input source, building pipelines out of conduits, and defining runtime parameters used in analysis and visualizations. Our goal is to add more and more domain-specific optimizations, in addition to library performance improvements, to the IAP so that newbies (or experts, for that matter) don't have to do this themselves.</p>
Gregg Lebovitz, Tue, 01 Apr 2014 08:37:43 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1313211373
<p>It didn't sound negative to me. It's perfectly reasonable to challenge claims in a blog post. There are far too many blog posts out there demonstrating amazing benchmarks that are somehow false.</p><p>I'm hoping in the next month or two to write up a blog post on the rewrite rule improvements to conduit and demonstrate them with some real benchmarks. I hope you'll test the results against Matlab again at that time :)</p>
Michael Snoyman, Tue, 01 Apr 2014 07:58:10 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1313174969
<p>It sounds interesting, but as a newbie I don't think I could manage to do this.
I also wonder how flexible this approach is, for example if you decide to change the algorithm somehow, since there are quite a few steps.</p>
twende, Tue, 01 Apr 2014 07:21:21 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1313166868
<p>That sounds great! I actually missed the part where you mentioned improvements in the blog post, sorry about that.</p><p>Reading my own post again, maybe I came across as a bit negative. From what I have learned about Haskell and its performance (e.g. different kinds of fusion and Intel's Haskell Research Compiler), I get the impression that Haskell could be a top contender for performance in numerical computing. I am happy every time I read about someone working on closing the gap with the conventional languages used for numerics (thinking of C/C++ and Fortran here).</p>
twende, Tue, 01 Apr 2014 07:14:03 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1313089744
<p>Doh! I got the exact same comment during a review of my Python code by Mike Meyer. I thought I'd made the update, but apparently I forgot to commit it.</p>
Michael Snoyman, Tue, 01 Apr 2014 06:09:46 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1313087870
<p>As I mentioned, there's room for improvement in the conduit code. I've experimented with a number of rewrite rules locally that brought my runtime from 3 seconds down to half a second. I haven't released those changes yet since they're currently somewhat ad hoc, and I'd like to test things a bit more thoroughly.
But those changes, in one form or another, will be part of the IAP release.</p>
Michael Snoyman, Tue, 01 Apr 2014 06:06:37 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1313087691
<p>You should try PyPy; it was many times faster than CPython for this task on my machine.</p>
k_bx, Tue, 01 Apr 2014 06:06:20 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1312548290
<p>My usual approach is to build the code in Haskell in an EDSL that fits the domain, generate code I can optimize, apply domain-specific optimizations, and hand it off to something like LLVM so I can run it wherever I want and remove any such overhead. With that approach you can remove the multiplier and still get the benefit of Haskell as a host language when it comes to composing the code functionally.</p>
edwardkmett, Mon, 31 Mar 2014 19:36:17 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1312519545
<p>I would really like to learn more about how to use Haskell for numerical computing, and how Haskell's nice properties can yield performant and modular code. However, when doing Monte Carlo simulations you usually want them to be *fast*, and I wonder how often Haskell can cut it in that regard.</p><p>For example, I compared the performance of your Haskell code on my computer (GHC 7.6.3) with that of two different Matlab functions, copying the style of your Python functions (matrix and looping). I did no optimization whatsoever, just wrote the code and ran it. The matrix version was about 9 times faster than the Haskell version; the looping version was about 3 times faster.
The memory usage of the matrix version can quite easily be handled by chunking the data into pieces of whatever size you choose (although it is not pretty).</p><p>If one piece of code is nine times slower than another, you can have a situation where the fast one finishes in one working day (8h) while the slow one takes three full days. Even if the slow code is easier to develop, understand and extend, it might therefore not be obvious that you should choose it instead of the fast and ugly code.</p>
twende, Mon, 31 Mar 2014 19:11:57 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1312236539
<p>And just for the sake of fairness, this works in constant memory by using Python's generators: <a href="https://gist.github.com/k-bx/989f777fa8b79e172d3b" rel="nofollow">https://gist.github.com/k-bx/9...</a></p><p>There *is* one problem with Python's iterators, though. Unlike conduit/pipes, they make it quite easy to accidentally force the entire dataset, or (at other times) you have to modify your algorithm because there is no look-ahead. They're more like Haskell's laziness, which also sometimes keeps you in constant memory and sometimes doesn't. So, to sum up this paragraph, you are absolutely correct regarding a "solution that scales".</p>
k_bx, Mon, 31 Mar 2014 15:41:51 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1312200524
<p>Maybe this is a bit more readable:</p><p>print(average([x * x + y * y < 1 for [x, y] in rand(10000000,2)]) * 4)</p>
k_bx, Mon, 31 Mar 2014 15:15:16 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1312093081
<p>The looping Python algorithm would run faster if we kept the variable successes an integer.
Floating-point division can then be forced at the end:</p><p>print (successes + 0.0)/count*4</p>
Boris Lykah, Mon, 31 Mar 2014 13:58:22 -0000

Re: Monte carlo analysis in Haskell
https://www.fpcomplete.com/blog/2014/03/monte-carlo-haskell#comment-1311992749
<p>If a Python list comprehension is taking up too much memory, we can often replace it with a generator (replace "[...]" with "(...)"). I tried this in the first snippet, but NumPy didn't like it (apparently generators are "iterable", but not "array-like"). Wrapping the generator in numpy.fromiter made it run, but gave very little benefit.</p><p>I then tried replacing the imperative version with a generator and a reduce, which reduced the time and memory usage considerably. Here's my code:</p><p>(n, d) = reduce(<br>    lambda (num, denom), (x, y): (num + (x*x + y*y < 1), denom + 1),<br>    ((random.random(), random.random()) for x in xrange(10000000)))<br>print (n / d) * 4</p><p>This gives plausible results (e.g. 3.14230378908) with a 'maximum resident set size' of 14672kB in 5.07s (as reported by time -v). For reference, the list comprehension snippet uses 258948kB in 20.84s and the imperative snippet uses 14656kB in 39.29s.</p>
Chris Warburton, Mon, 31 Mar 2014 12:46:11 -0000
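The snippets quoted throughout this thread are Python 2 and depend on each commenter's setup. For readers following along, here is a minimal, self-contained Python 3 sketch of the constant-memory, generator-based Monte Carlo estimator the commenters are discussing; the function name `estimate_pi` is my own, and this is a neutral illustration, not any commenter's benchmarked code:

```python
import random

def estimate_pi(samples):
    """Monte Carlo estimate of pi.

    Draw points uniformly in the unit square; the fraction that lands
    inside the quarter circle x^2 + y^2 < 1 approaches pi/4. The
    generator expression keeps memory usage constant regardless of
    the sample count, unlike a list comprehension.
    """
    hits = sum(
        1
        for _ in range(samples)
        if random.random() ** 2 + random.random() ** 2 < 1
    )
    return 4 * hits / samples

print(estimate_pi(100_000))
```

With 100,000 samples the standard error of the estimate is about 0.005, so results typically land within a few hundredths of 3.14159; increasing `samples` tightens the estimate at the cost of runtime.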