Burning in Greenplum cluster servers
To benchmark a Greenplum cluster, I’ve modified the gensort program to generate a repeatable set of data that mimics impression and click data. I put this to the test recently as we were breaking in some new hardware. The idea being to get the […]
Greenplum vs Hadoop Disk Space
I’ve been spending a whole lot of time calculating Greenplum vs Hadoop disk usage. So here is the general equation: (MaxAllocFactor * DiskSize * (#Disk - RaidDisks)) / ReplicationFactor, where MaxAllocFactor = max recommended allocation (70% for Greenplum and 75% for Hadoop), DiskSize = size of your drive, #Disk = number of drives, RaidDisks […]
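As a rough illustration, the equation above can be written as a small Python function. This is only a sketch: the function name, the parameter names, and every number in the usage example are made up for illustration. The excerpt cuts off before defining RaidDisks, so the code simply assumes it means the number of drives not available for data, and it assumes a replication factor of 2 for a mirrored Greenplum setup and 3 for Hadoop (the HDFS default).

```python
def usable_capacity(max_alloc_factor, disk_size, num_disks, raid_disks, replication_factor):
    """Usable capacity per server, in the same unit as disk_size.

    Mirrors the equation from the post:
    (MaxAllocFactor * DiskSize * (#Disk - RaidDisks)) / ReplicationFactor

    raid_disks is ASSUMED to be the number of drives not available for
    data; the original definition is truncated in the excerpt.
    """
    if replication_factor <= 0:
        raise ValueError("replication_factor must be positive")
    if raid_disks > num_disks:
        raise ValueError("raid_disks cannot exceed num_disks")
    data_disks = num_disks - raid_disks
    return max_alloc_factor * disk_size * data_disks / replication_factor


# Hypothetical server: 12 drives of 2 TB each (illustrative numbers only).
# Greenplum: 70% max allocation, 2 drives assumed lost to RAID, mirroring assumed (factor 2).
greenplum_tb = usable_capacity(0.70, 2.0, 12, 2, 2)
# Hadoop: 75% max allocation, no RAID assumed, replication factor 3.
hadoop_tb = usable_capacity(0.75, 2.0, 12, 0, 3)

print(f"Greenplum: {greenplum_tb:.1f} TB usable, Hadoop: {hadoop_tb:.1f} TB usable")
```

Keeping the allocation factor and the replication factor as separate arguments makes it easy to rerun the comparison with your own hardware and redundancy settings.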
End of Days
Greenplum Days has come to a close, and the last two sessions showed they saved the best for last. The Expansion/Fault Tolerance talk was far and away the pinnacle learning session, very much highlighting how 4.0 fixes some of the pain points in the 3.3.x versions, and you can see the foundational building blocks for some […]
Greenplum Days – Day 2
At this point it’s the afternoon of Day 2 of the two-day Greenplum Days conference. As a non-C-level person I have gotten very little exposure to the inner workings of Greenplum Chorus. I have a feeling that isn’t an accident, as people at my level would nail Greenplum with a multitude […]
What’s that Greenplum thing
Tomorrow I catch my flight to Las Vegas so I can attend Greenplum Days on Monday and Tuesday. I’m pretty excited about this trip even though I will be in Vegas and essentially spending all of my time in a conference. You see, for the last month or so I’ve been working heavily with the Greenplum […]