The chap who runs my AirBnB went through basic and is now part of a mobile air defence unit in Kyiv.
We were talking about New Year’s Eve/Day, which were busy, and he sent me some footage from the 2nd Jan (which had been quiet for me - but you only really know about whatever is close enough for you to see or hear). I’ve annotated the footage to the best of my knowledge and local experience.
00:00 : to begin with, we see (and hear) AAA fire in the sky, by dint of tracer rounds. I wondered why, given it’s all radar guided. My cousin (former Captain in the UK army) tells me standard ammo boxing has one round in five being tracer. I can imagine it’s useful, as it helps inform other defenders you’re firing at something.
00:07 : then a single point of light rises up into the sky; this is a surface-to-air missile, I would guess shoulder-launched. It disappears when it comes into the cloud base (which is not particularly visible, as it is dark). Missiles seem surprisingly slow, but we are looking here at something covering a couple of kilometres in a couple of seconds. I’ve seen these things myself; the CCD blurs out their exhaust a little. To the eye, they’re sharp points of light.
00:07 : while the missile is rising, there’s a quick single explosion in the sky over to the right, below the datestamp. This, I think from its size and shape, is a drone being blown up.
00:15 : a very quick explosion a bit to the right of the top center - I suspect this is the missile from 00:07 finding its mark. I suspect what you or I would have seen, had we been there, with the human eye, would have been somewhat different to what the CCD captures; I think that explosion was above the cloud base, and so we would have seen a white flash lighting up the clouds from above. I suspect the CCD doesn’t pick up such a sky-wide flash of light very well, but it does pick up the core of the explosion, even through the clouds.
00:16 : AAA fire begins again, from the left.
00:17 : then there follows, in line with the AAA fire, first, two big explosions right next to each other and one right after the other, then a short string of similar explosions, a bit further on, in the line of fire. I think the first two explosions are from proximity fuses (fuses with little radars in, invented during WW2), and the later explosions are from timed fuses, in part to try and bracket the target, and in part to stop the rounds flying off into the distance and then coming down onto residential Kyiv.
00:18 : while the AAA is ongoing, another shoulder-launched surface-to-air missile begins to rise into the sky, from the same location and so the same team who fired the first missile. As that missile proceeds, we become aware it is in fact coming towards us; it is indeed aimed at a target behind the camera, and so it flies toward the camera, and then passes overhead. The smoke trail is very clear, even in the dark.
00:24 : the missile which passed overhead finds its mark, the explosion being clear, loud and nearby. That type of bang, very crisp, is an explosion in the air. Ground explosions are longer, deeper, much more rolling and “boomy”. Air explosions are like a balloon bursting.
00:25 : more AAA rises up into the sky, from the same source, on the left. While this is occurring, a rapid sequence of explosions occurs in the very top right. This I think is AAA fire, detonating around a target (proximity fuses), as the explosions are multiple, rapid, and in the same location.
00:29 : now, this I don’t quite get; we have another string of explosions, as at 00:17 - but it’s not in line with the AAA fire we can see rising into the sky. I’m not sure where this comes from. I do note the location is pretty much the same as the earlier string of explosions.
In any event, the explosions illuminate a smoke trail, hanging in the air, which given its location is very likely from an earlier shoulder-launched surface-to-air missile, from the same team we saw firing at 00:07 and 00:24.
00:32 : the blast waves from explosions are often strong enough to set off nearby car alarms, which is what now occurs.
00:34 : explosions in the top-right, the edge of which can just be seen, but which can be heard clearly.
00:36 : and here - another string of explosions, and again, unconnected to any tracer fire, in much the same place as the earlier strings of explosions, with the lingering missile smoke trail again being made visible by the light. My guess is some of the ammunition being fired had no tracer rounds.
00:40 : a big flash, at ground level, just off the left side of the screen; that was an incoming drone or missile, and it hit something, as the street lighting goes out about a second later. Exactly what it hit, I do not know. It might just have blown up a bunch of transmission wires, or it might have hit a substation or something like that directly (although it would need some luck to do so - the drones and missiles in use do not seem all that accurate). I have no idea if, or how many, people caught it.
After that, it’s over, just like that. A couple of cars drive by; there’s a curfew at 11pm and I think this was after it, so they would be military, or civilian authorities.
I would now draw your attention to a larger, hopeful, and encouraging picture : that I am here in Kyiv, writing this, because the West has stepped up, and provided weapons, supplies and support, to Ukraine.
The actions of the Governments of most of the people who will read this have been and are absolutely central to the strong and ongoing resistance by Ukraine. We are not bystanders or helpless : we are all, through our taxes put to this end, doing our bit, and it’s working.
Putin’s underestimation of the power of the West is as great as that power will prove to be.
In Postgres, roles replaced users and groups (both become roles). Roles in Redshift do not replace users or groups, but exist alongside them, as a third, separate, first-class type. Roles, like groups, can be granted privileges, but roles, unlike groups, can be granted to roles. Granting a role is in effect the same as adding a user to a group. Along with roles come a new set of Redshift-specific privileges, which can be granted to and only to roles (roles can also be granted the existing Postgres-style privileges). These new privileges are unlike the existing privileges (which are per-user, per-object) as they are global; they apply to everything, everywhere, always, in all databases. They are essentially fragments of the super user. Finally, note the documentation for roles is particularly egregious, and there are a few of the new privileges which are properly tantamount to granting super user, as they allow a user to elevate themselves to super user.
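To make the mechanics concrete, here is a minimal sketch of the syntax involved - the role, user, and table names are invented for illustration, and the system privilege shown is just one example of the new Redshift-only, role-only, global privileges;

```sql
-- Hypothetical names throughout; a minimal sketch of role syntax in Redshift.
CREATE ROLE etl_admin;

-- Postgres-style, per-object privilege - can be granted to a role, just as to a user or group.
GRANT SELECT ON TABLE public.sales TO ROLE etl_admin;

-- New Redshift-specific, global privilege - grantable to roles only, applies everywhere.
GRANT CREATE USER TO ROLE etl_admin;

-- Roles can be granted to roles (which groups cannot do)...
CREATE ROLE junior_etl_admin;
GRANT ROLE junior_etl_admin TO ROLE etl_admin;

-- ...and granting a role to a user is in effect adding that user to a group.
GRANT ROLE etl_admin TO alice;
```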
ap-southeast-1 dc2.large benchmarked 0.3s for the disk read benchmark, vs the usual 0.06s. May just be a bad cluster.
ap-southeast-1 dc2.large has received a large upgrade to the disk-read-write benchmark and the network benchmark. Disk-read-write has gone from the historically constant and slow ~13.5s to the normal ~4.5s or so, network from the usual ~12.3s to the normal ~4s. Disk read is still slow (0.14s vs 0.06s) but this region has now seen a serious upgrade, moving from the slow lane to the normal lane.
eu-west-1 dc2.large disk read benchmark also having a bad hair day, clocking at 0.18s, rather than the usual 0.06s.
https://www.amazonredshift-observatory.ch/cross_region_benchmarks/index.html
I’ve been working for many months on a browser-based AMI for managing Redshift - and it brings me very great pleasure to proudly present the one, the only, the amazingly fantabulous - Max Ganz’s Redshift Combobulator!
There is now a live Combobulator running, which is pointing at a Redshift cluster I’ve populated with an artificial database.
As of 19:00 UTC 2023-02-13 the site is up, and I’m babysitting it, and I’ll leave it running overnight and tomorrow.
To avoid bots walking all over it, I’m going to obfuscate the IP address;
https:// fifty-four. one-hundred-and-sixty-two. fifty-eight. one-hundred-and-twenty-seven
Note - the AMI is designed to run locally, completely firewalled (for PII), and so it has to use a self-signed certificate, because there is no domain name for the IP.
Just click through the warning the browser gives (I promise not to use your credit cards :-)
If you have any questions, or, Heaven forbid!, problems, email me! I’ll be keeping an eye on email, so I should respond in a timely manner.
The ra3.xlplus node type in ap-northeast-2, ca-central-1, eu-south-1, eu-west-1, eu-west-2, us-west-1 and us-west-2 all experienced issues with the disk-read-write benchmark.
The usual performance, which is highly consistent with this node type, is 2.40/0.01s (mean and standard deviation). In all these regions, the benchmark is now about 3.40/2.5s - a huge standard deviation. Something is glitching in all these regions.
ap-northeast-2 dc2.large network benchmark is back to normal, after a month of being slow (>7s vs ~4s).
ap-southeast-1 dc2.large disk-read benchmark has returned to normal after last benchmark’s unusually slow (and possibly just a bad cluster) result.
ap-southeast-3 dc2.large looks to have received the disk-read upgrade, and is now at 0.06s (rather than the slower class of 0.14s nodes).
ca-central-1 dc2.large has lost the disk-read upgrade, and is now at 0.14s. I begin to suspect there may be multiple populations of the same node type. I need to sit down and run a set of benchmarks where I take one region and simply run the benchmarks on one cluster after another, for say ten clusters in a row, and see how much variance is found.
me-south-1 ra3.xlplus processor benchmark returned to normal (2.19/0.00s now, last benchmark was 3.47/0.77s).
us-east-2 ra3.xlplus disk-read-write went from slow (3.50/2.54s - the huge standard deviation again, as seen in many regions for this benchmark) back to normal (2.49/0.02s).
us-west-2 dc2.large and ra3.xlplus both received processor upgrades, going from 3.95/0.08 to 2.91/0.02, and from 2.05/0.06 to 2.21/0.01, respectively. Normally the processor result is pretty much bang-on identical across the bi-weekly benchmarks, so these are stark and large upgrades.
https://www.amazonredshift-observatory.ch/cross_region_benchmarks/index.html
A small note to say releases seem to have been coming much more slowly over the last two or so months.
https://www.amazonredshift-observatory.ch/redshift_version_tracker/index.html
Updated a week ago with the new 1.0.46607 release.
https://www.amazonredshift-observatory.ch/system_table_tracker/index.html
Republished today.
I think I may need to partition the results into months or something like that, so changes over time can be seen. I changed the order of the time columns from mean/sd-max-min to max-mean/sd-min, which I think is significantly easier to take in.
https://www.amazonredshift-observatory.ch/bring_up_times/index.html
Not one price has changed in the history of tracking =-)
https://www.amazonredshift-observatory.ch/redshift_price_tracker/index.html
The AWS Big Data Blog, on the 13th Feb 2023, so just a few days ago, published a blog post about a migration to ra3 by a company, OLX.
I worked at OLX, as a Redshift specialist contractor, for about four months, over the end of 2019, start of 2020.
I was admin on and working on the very Redshift cluster that was migrated.
I worked with Miguel (one of the two blog post authors, the other being from AWS) most days over that time.
Now, by and large, you never speak about the internal state of a client. It’s a simple breach of confidence. You would no more do so than pass on a secret your best friend had confided in you.
However, I have some concern about what has been published in the blog post. I feel it paints a significantly inaccurate picture, and that this picture is being used by AWS to persuade people to give money to AWS.
Given these considerations, I’ve written about the blog post, but with tact and discretion, and reducing information about OLX itself to the absolute minimum. I speak as much as possible in general terms.
So, having explained the situation, let’s have a look at what’s been written.
I’ve quoted only relevant sections. To see the whole blog post, go and read it.
The first thing to mention, then, is that to my knowledge this was in fact not the first migration attempt. There was one before this, after I left, which was, as I understood it, pretty much unplanned - suck it and see. The cluster was moved over, it did not work well, and they rapidly reverted.
This migration, described in the blog post, was to my eye much more carefully planned, and in the end, they were able to stay on ra3.
It is important to publish failures, not only successes, so that people can know what they really do need to do, to be successful.
(I must mention that I think Miguel would not have been in charge of that earlier attempt; what you see in the blog post, the preparation and care, that’s him - I worked with him, and I think he’s good; careful, thoughtful, thorough, and a thoroughly nice chap.)
Before migrating to RA3, we were using a 16 DC2.8xlarge nodes cluster with a highly tuned workload management (WLM), and performance wasn’t an issue at all.
Redshift is highly knowledge intensive. The admin, devs and the users must fully understand the impact sorting has upon table design and queries. When there are large numbers of ordinary users, this is highly unlikely to be the case.
As such, and speaking now in general terms about Redshift in industry, I may be wrong, but I am of the view that about 95% of all Redshift clusters are incorrectly operated and the users are having a terrible time; typically, no one involved knows about sorting or its implications for them and their work, and Redshift is used as if it were Postgres - but with the expectation that, as a “data warehouse”, it has much greater capacity.
In fact, Redshift is a wholly different type of database, with entirely different use cases. It is not a general purpose database, and cannot be used by all and sundry. It is extremely knowledge intensive.
I must also mention the issue of selection bias. It can be that potential users of Redshift, data analysts for example, have other data processing systems available to them, and if they are not happy with Redshift, they stop using Redshift. With any data system which has issues, those users who are unhappy and can leave, do leave, and you’re left only with those who are getting on just fine, and those who cannot leave - and they, I find, usually give up complaining because nothing changes.
However, we kept facing challenges with storage demand due to having more users, more data sources, and more prepared data. Almost every day we would get an alert that our disk space was close to 100%, which was about 40 TB worth of data.
I think when there are large numbers of users, and their database use is uncontrolled, there is normally going to be a surprisingly rapid and ongoing increase in disk use.
With a Redshift cluster, once you get to 90% full, you start finding yourself spending time keeping the cluster running. Once it gets to 95% full, you’re spending all your time keeping the cluster running.
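As an aside, a sketch of the kind of query which can be used to keep an eye on this - it uses the stv_partitions system table, and what thresholds you act on are, of course, your own choice;

```sql
-- Approximate percentage of total disk in use across the cluster.
SELECT ROUND( SUM(used)::DECIMAL / SUM(capacity) * 100, 1 ) AS pct_disk_used
FROM stv_partitions;
```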
Being short on disk space, and having a considerable and ongoing increase in disk use, obviously leads to a difficult situation. What’s needed are more nodes - to expand the cluster - but here we can run into administrative issues; there can be irreducible delays in getting approval for new nodes, perhaps a need to wait for the next budget cycle - indeed, there can even be political delays, where questions come to be asked as to why the cluster is growing so quickly.
(In fact, part of what I did there was write a generalized table DDL modifier, which converted a large number of small tables to unsorted, and freed up about 20% disk space. It was quite a piece of work - if you want to change a table, you have to figure out its dependencies, enumerate them all, record their DDL (views, etc) so you can recreate them, and record all privs which have been granted on everything, drop all dependencies, copy the table into its new form, then rebuild all dependencies - in the correct order - and regrant all privs. I wrote this code, although I’d written quite a lot of it in my own time prior to the gig, and just brought it into play. One off-thought about this… this was before auto for table sorting, so not specifying a sortkey meant you were unsorted. However, once auto was introduced, not specifying a sortkey means you get auto. When this feature was introduced, did it convert all those unsorted small tables to auto? Did a lot of them stop being unsorted, and so end up going back to using huge amounts of overhead?)
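To give a feel for the core of what that tool did for a single small table - leaving aside the dependency and privilege handling, which was most of the real work - here is a conceptual sketch only, with invented table and column names;

```sql
-- Conceptual sketch; the real tool enumerated dependencies, recorded view DDL and
-- privileges, and rebuilt everything in the correct order. All names are hypothetical.

-- 1. Create the replacement table with no SORTKEY (which, pre-auto, meant unsorted).
CREATE TABLE public.country_codes_unsorted
(
  code  VARCHAR(2),
  name  VARCHAR(64)
)
DISTSTYLE ALL;

-- 2. Copy the data across.
INSERT INTO public.country_codes_unsorted SELECT * FROM public.country_codes;

-- 3. Drop the (previously recorded) dependent views, then swap the tables.
DROP TABLE public.country_codes;
ALTER TABLE public.country_codes_unsorted RENAME TO country_codes;

-- 4. Recreate the recorded views, in dependency order, and re-grant all recorded privileges.
```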
Our usual method to solve storage problems used to be to simply increase the number of nodes. Overall, we reached a cluster size of 18 nodes. However, this solution wasn’t cost-efficient enough because we were adding compute capacity to the cluster even though computation power was underutilized.
In the usual case, when Redshift is incorrectly operated, almost all joins are hash joins, because admin and users lack the necessary knowledge to allow for merge joins. Hash joins are expensive in every way - disk space, network, compute.
When a cluster is operated incorrectly, I would always expect compute to be heavily used, because everything is a hash join.
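By way of contrast, here is a sketch - table and column names invented - of the kind of table design which permits merge joins; both tables distributed and sorted on the join column, and kept sorted;

```sql
-- Hypothetical tables; with matching DISTKEY and SORTKEY on the join column,
-- and with both tables fully sorted, the planner can use a merge join.
CREATE TABLE customers
(
  customer_id  INT8 NOT NULL,
  name         VARCHAR(128)
)
DISTSTYLE KEY
DISTKEY ( customer_id )
SORTKEY ( customer_id );

CREATE TABLE sales
(
  customer_id  INT8 NOT NULL,
  amount       DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY ( customer_id )
SORTKEY ( customer_id );

-- This join can now be a merge join - no hashing, no redistribution.
SELECT c.name, SUM(s.amount)
FROM sales AS s
JOIN customers AS c ON c.customer_id = s.customer_id
GROUP BY c.name;
```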
This cluster’s performance was generally good, ETL (extract, transform, and load) and interactive queries barely had any queue time, and 80% of them would finish in under 5 minutes.
If the ETL job was processing a small amount of data, perhaps it ought to finish in 30 seconds, and if it’s taking 300, then that would mean there is a problem - it means nothing to only say how long a job takes.
I note also specifically interactive queries had barely any queue time - but what proportion of queries were interactive? and how long did they take once they were running?
To my eye, this paragraph is too brief, and so creates questions rather than providing answers.
Initially, we chose the RA3.4xlarge node type for more granular control in fine-tuning the number of nodes. However, we overlooked one important detail: the same instance type is used for worker and leader nodes.
Now, this is interesting. This is a basic Redshift fact. OLX may not have known this, but the AWS staff involved would have.
A leader node needs to manage all the parallel processing happening in the cluster, and a single RA3.4xlarge wasn’t enough to do so.
In the normal case, I am of the view the leader node for a cluster, regardless of node type, is just fine. I think the AWS staff thought this too - and bear in mind here, OLX went to an 18 node cluster when they were trying ra3.4xlarge - but the maximum number of nodes is 128! It’s not like they were pushing the limits. I also have reports from skilled Redshift admin running very large node count clusters that the leader node can seem to become a processor-time bottleneck, when you have a lot of nodes, but apart from that, it’s okay.
To my eye then, something with the workload on that cluster was abnormally stressing the leader node, rather than the leader node in general normally being underpowered; and this caught out the AWS staff, because it’s out of the ordinary.
Use 6 x RA3.4xlarge, restored from 6 x RA3.4xlarge snapshot.
Use 18 x RA3.4xlarge, elastic resize from 6 x RA3.4xlarge.
This was likely a mistake. A new cluster of 6 ra3.4xlarge would have 4 full slices per node - a total of 24 full slices. The elastic resize to 18 nodes would have distributed those slices over the 18 nodes, so 12 nodes would have 1 full slice, and 6 nodes 2 full slices - the balance on each node, to 4 slices, being made up by partial slices, which do not read data from RMS and participate in a limited number of step types only.
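As an aside, the slice-to-node mapping is visible in the stv_slices system table, so a sketch of a query to see how slices ended up distributed after a resize;

```sql
-- Number of slices per compute node (column availability varies a little by Redshift version).
SELECT node, COUNT(*) AS slices
FROM stv_slices
GROUP BY node
ORDER BY node;
```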
AWS would have known this, and they would not have told OLX - AWS tell clients “they don’t need to know about the difference”. Nor, I would say, could AWS have said anything about the fact that this resize would lead to what I expect was quite different performance to that after a classic resize, because that would require them to say a classic resize was necessary - and a classic resize would take days; information too negative to share.
In case the performance of hourly and daily ETL is not acceptable, the contingency plan is triggered:
Add one more node to deal with the unexpected workload. Increase the limit of concurrency scaling hours. Reassess the parameter group.
The next morning, we woke up to what we dreaded: a slow cluster.
Which must mean queries were in fact running more slowly than before.
There are, to my eye, two primary considerations when moving to ra3.

The first is that dollar for dollar, you’re getting less RAM. I have heard of other skilled Redshift admin experimenting with migrations, and this seemed to be the issue they were running into - the conversion ratio which AWS present for dc2 and ds2 to ra3 is over-optimistic, and if you follow it, you end up with insufficient memory, and performance can fall off a cliff.

The second issue is that on dc2 and ds2, all storage is local and runs at the speed of local disk. With ra3, storage is now in the cloud, and my benchmarking of ra3.xlplus indicates RMS runs at 25% of the speed of the local disk (which is a block-based LRU cache of RMS).
So, put simply, the storage devices have been reduced in speed by a factor of 4 (but now have a local cache) - in other words, storage is slower. Whatever you might say about this, you can’t say you expect performance to increase because of the move to RMS.
We triggered our contingency plan and in the following few days we ended up implementing all three actions we had in the contingency plan.
Adding one extra node itself didn’t provide much help,
Presumably an elastic resize, as it seems it was done quickly, which would not have given quite the changes OLX were expecting.
however users did experience good performance during the hours concurrency scaling was on.
I am very strongly against concurrency scaling. First, I consider it inappropriate - palliative. Redshift scales by operating sorting correctly which means queries are lightning fast. You do not need to run many queries in parallel, because the queries which are issued are over very quickly. When Redshift is operated incorrectly, and queries are slow because sorting is not being used, concurrency scaling can help; but it’s expensive. You pay at the per-second rate.
In this situation, it was expedient.
The concurrency scaling feature allows Amazon Redshift to temporarily increase cluster capacity whenever the workload requires it. We configured it to allow a maximum of 4 hours per day—1 hour for free and 3 hours paid. We chose this particular value because price-wise it is equivalent to adding one more node (taking us to nine nodes) with the added advantage of only using and paying for it when the workload requires it.
Right, but I would expect all the CSC time is being used.
Therefore, we decided to try auto WLM with the following configuration.
I am very strongly against AutoWLM. It is a black box, which has a troubled history, and it changes without notification from AWS. You cannot knowingly design a correct system when that system contains black boxes, let alone black boxes which change without warning.
What I currently know of AutoWLM (information and rumours slowly emerge, one way and another) is that every 200 queries, AutoWLM recalculates slots and memory allocation.
The problem is, if you run a bunch of light queries, and so get tiny slots, and then run a big query, suddenly you’re in trouble.
What I hear is that people using it see it runs fine for a while, and then sooner or later, a big backlog of queries forms.
I would be interested to know more about OLX’s experience with AutoWLM, and I am not sure that we are necessarily getting the whole picture here.
After applying concurrency scaling and auto WLM, we achieved stable performance for a whole week, and considered the migration a success.
Stable performance for a week, I have to say, feels like an underwhelming attainment; but I think here Miguel’s innate truthfulness as an engineer is manifesting itself.
We are now at 300 monthly active users. Cluster costs did increase due to the new node type and concurrency scaling,
Which brings to mind the blog post title;
How OLX Group migrated to Amazon Redshift RA3 for simpler, faster, and more cost-effective analytics
Cost-effective, by being more expensive?
AWS will have picked the blog post title, of course, not Mig or OLX.
but we now feel prepared for the future and don’t expect any cluster resizing anytime soon.
So, I may be wrong, but to my eye, what’s happened here is that the disk space issue has been resolved, from the point of view of the Redshift admin team.
Previously, with dc2, new nodes were needed, and they were regularly needed, and this was a problem. Now with ra3, there’s RMS; the rate at which new data is coming in hasn’t changed at all - and it’s being paid for, RMS is not free - but it no longer intrudes. There’s no longer a discrete, particular bill for another reserved instance; RMS costs are just a quiet part of the ongoing AWS bill.
This benefit has come at a price; the cluster is slower, and so now needs concurrency scaling and concurrency scaling is expensive for what it is, and the nodes in the cluster itself are more expensive than before.
I have to say, also, that I can’t see anyone here asking the users how things are for them; and changing the node type is of course not going to influence whether the cluster is correctly or incorrectly operated.
(To be fair, no one ever does ask users. Users to my eye are like tax payers using free at the point of use health services; they have no power or influence of any kind. The only way users have power is if they pay for the services they use, and can go elsewhere.)
Let’s have a look at pricing.
Originally, 18xdc2.8xlarge, no concurrency scaling.
In the Frankfurt region, with one year reservation and all up-front, this costs 35,910 per node, which is 646,380 USD in total.
Now we have 8xra3.16xlarge, I’ll work out RMS and concurrency after the nodes.
In the Frankfurt region, with one year reservation and all up-front, this costs 90,065 per node, which is 720,520 in total.
RMS is 2.4 cents per gigabyte per month, so 40 terabytes, which is 40,960 gigabytes, is 983.04 USD per month, which is 11,796 USD per year in total.
CSC is three hours per day (I’ll assume weekdays only) at the per-second rate.
For ra3.16xlarge, that’s 15.578 USD per node per hour, so three hours a day, 260 working days (give or take one or two) per year, eight nodes, so 97,207 USD per year.
Grand total : 829,523 USD.
Previous total : 646,380 USD.
Increase : 183,143 USD.
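For the record, the arithmetic, as a single query - the figures are those given above; the 260 weekdays and three CSC hours per day are my assumptions;

```sql
-- Cost arithmetic from the figures above, in USD (node prices are one-year, all up-front reservations).
SELECT
  18 * 35910                        AS old_dc2_nodes,  -- 646,380
  8 * 90065                         AS new_ra3_nodes,  -- 720,520
  ROUND( 40960 * 0.024 * 12 )       AS rms_per_year,   -- ~11,796
  ROUND( 15.578 * 3 * 260 * 8 )     AS csc_per_year,   -- ~97,207
  8 * 90065
    + ROUND( 40960 * 0.024 * 12 )
    + ROUND( 15.578 * 3 * 260 * 8 ) AS new_total;      -- ~829,523
```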
Jesus. I could save these guys so much money :-)
Know what would be interesting? figuring out the hardware cost of the nodes, and then thinking about just buying them (one off cost of 150k maybe? including 80 TB storage) and running clustered Postgres on them.
The best photo I’ve taken in Kyiv.
In the bi-weekly benchmarks, from fortnight to fortnight, there’s a lot of variation between dc2.large clusters.
Now, it could be that those clusters are identical if you fire them up one after the other, and the changes seen in the bi-weekly benchmark reflect longer-term changes in the nodes or their environment - or it might be clusters vary a lot, even if you’re just making one cluster directly after another.
I’ve been meaning to investigate this for ages.
So this evening, finally, I fired up and benchmarked ten 2xdc2.large clusters, one directly after the other.
Here are the results;
cluster_id | bring_up_start_utc | bm_name | node_id | max_duration | avg_duration | min_duration
------------+----------------------------+---------+---------+-----------------+-----------------+-----------------
pIDw0 | 2023-02-24 18:01:29.844692 | dr | 0 | 00:00:00.06885 | 00:00:00.065582 | 00:00:00.063572
MjdMZ | 2023-02-24 18:07:20.308433 | dr | 0 | 00:00:01.051501 | 00:00:00.716193 | 00:00:00.346439
Cz6Gl | 2023-02-24 18:13:32.535197 | dr | 0 | 00:00:00.063639 | 00:00:00.062683 | 00:00:00.062258
dga69 | 2023-02-24 18:18:09.281297 | dr | 0 | 00:00:00.06043 | 00:00:00.059754 | 00:00:00.059462
pCsny | 2023-02-24 18:26:21.088642 | dr | 0 | 00:00:00.064252 | 00:00:00.062574 | 00:00:00.061975
GnLyz | 2023-02-24 18:39:44.874939 | dr | 0 | 00:00:00.064635 | 00:00:00.062722 | 00:00:00.060401
lvmWk | 2023-02-24 18:50:59.088157 | dr | 0 | 00:00:00.217796 | 00:00:00.115286 | 00:00:00.0661
cC1v9 | 2023-02-24 18:58:49.007792 | dr | 0 | 00:00:00.07829 | 00:00:00.068956 | 00:00:00.063517
MBVwj | 2023-02-24 19:10:12.009875 | dr | 0 | 00:00:01.046866 | 00:00:00.729639 | 00:00:00.349607
QoJ3w | 2023-02-24 19:16:48.52368 | dr | 0 | 00:00:00.692264 | 00:00:00.524915 | 00:00:00.25648
pIDw0 | 2023-02-24 18:01:29.844692 | drw | 0 | 00:00:08.214498 | 00:00:05.606798 | 00:00:03.041891
pIDw0 | 2023-02-24 18:01:29.844692 | drw | 1 | 00:00:10.011731 | 00:00:05.299469 | 00:00:03.147677
MjdMZ | 2023-02-24 18:07:20.308433 | drw | 0 | 00:00:06.893068 | 00:00:04.382553 | 00:00:03.062129
MjdMZ | 2023-02-24 18:07:20.308433 | drw | 1 | 00:00:10.461527 | 00:00:06.719437 | 00:00:04.182195
Cz6Gl | 2023-02-24 18:13:32.535197 | drw | 0 | 00:00:12.364854 | 00:00:05.640658 | 00:00:03.179939
Cz6Gl | 2023-02-24 18:13:32.535197 | drw | 1 | 00:00:11.404383 | 00:00:07.081053 | 00:00:03.948546
dga69 | 2023-02-24 18:18:09.281297 | drw | 0 | 00:00:07.586488 | 00:00:04.774773 | 00:00:03.066495
dga69 | 2023-02-24 18:18:09.281297 | drw | 1 | 00:00:07.84923 | 00:00:05.224053 | 00:00:03.169702
pCsny | 2023-02-24 18:26:21.088642 | drw | 0 | 00:00:09.104722 | 00:00:04.521909 | 00:00:02.987616
pCsny | 2023-02-24 18:26:21.088642 | drw | 1 | 00:00:07.463868 | 00:00:05.39949 | 00:00:02.967225
GnLyz | 2023-02-24 18:39:44.874939 | drw | 0 | 00:00:13.019179 | 00:00:07.253439 | 00:00:03.191199
GnLyz | 2023-02-24 18:39:44.874939 | drw | 1 | 00:00:13.98035 | 00:00:07.360802 | 00:00:03.044719
lvmWk | 2023-02-24 18:50:59.088157 | drw | 0 | 00:00:07.401188 | 00:00:04.765769 | 00:00:03.098318
lvmWk | 2023-02-24 18:50:59.088157 | drw | 1 | 00:00:12.000361 | 00:00:06.5008 | 00:00:03.297659
cC1v9 | 2023-02-24 18:58:49.007792 | drw | 0 | 00:00:06.068048 | 00:00:04.266258 | 00:00:03.046517
cC1v9 | 2023-02-24 18:58:49.007792 | drw | 1 | 00:00:09.708544 | 00:00:06.930116 | 00:00:03.12746
MBVwj | 2023-02-24 19:10:12.009875 | drw | 0 | 00:00:07.388561 | 00:00:04.866444 | 00:00:03.137688
MBVwj | 2023-02-24 19:10:12.009875 | drw | 1 | 00:00:10.531674 | 00:00:07.835683 | 00:00:04.639588
QoJ3w | 2023-02-24 19:16:48.52368 | drw | 0 | 00:00:06.425808 | 00:00:04.532703 | 00:00:03.205344
QoJ3w | 2023-02-24 19:16:48.52368 | drw | 1 | 00:00:07.122254 | 00:00:04.543377 | 00:00:03.122562
pIDw0 | 2023-02-24 18:01:29.844692 | n | 0 | 00:00:05.375796 | 00:00:04.289109 | 00:00:03.95347
MjdMZ | 2023-02-24 18:07:20.308433 | n | 0 | 00:00:04.489905 | 00:00:04.354318 | 00:00:04.236918
Cz6Gl | 2023-02-24 18:13:32.535197 | n | 0 | 00:00:04.255905 | 00:00:04.033638 | 00:00:03.814545
dga69 | 2023-02-24 18:18:09.281297 | n | 0 | 00:00:07.373131 | 00:00:04.93187 | 00:00:04.128055
pCsny | 2023-02-24 18:26:21.088642 | n | 0 | 00:00:04.203148 | 00:00:03.981212 | 00:00:03.702442
GnLyz | 2023-02-24 18:39:44.874939 | n | 0 | 00:00:04.552793 | 00:00:04.399579 | 00:00:04.257074
lvmWk | 2023-02-24 18:50:59.088157 | n | 0 | 00:00:05.207047 | 00:00:04.702628 | 00:00:04.289306
cC1v9 | 2023-02-24 18:58:49.007792 | n | 0 | 00:00:05.844316 | 00:00:04.76548 | 00:00:03.953268
MBVwj | 2023-02-24 19:10:12.009875 | n | 0 | 00:00:05.663703 | 00:00:04.941265 | 00:00:04.649054
QoJ3w | 2023-02-24 19:16:48.52368 | n | 0 | 00:00:09.035593 | 00:00:05.742293 | 00:00:04.7567
pIDw0 | 2023-02-24 18:01:29.844692 | p | -1 | 00:00:02.955424 | 00:00:02.926266 | 00:00:02.896877
MjdMZ | 2023-02-24 18:07:20.308433 | p | -1 | 00:00:03.720681 | 00:00:03.093161 | 00:00:02.881999
Cz6Gl | 2023-02-24 18:13:32.535197 | p | -1 | 00:00:03.316626 | 00:00:03.068281 | 00:00:02.881514
dga69 | 2023-02-24 18:18:09.281297 | p | -1 | 00:00:03.104251 | 00:00:02.95971 | 00:00:02.904142
pCsny | 2023-02-24 18:26:21.088642 | p | -1 | 00:00:03.814585 | 00:00:03.195639 | 00:00:02.906804
GnLyz | 2023-02-24 18:39:44.874939 | p | -1 | 00:00:03.114035 | 00:00:02.981399 | 00:00:02.895505
lvmWk | 2023-02-24 18:50:59.088157 | p | -1 | 00:00:03.432149 | 00:00:03.061231 | 00:00:02.919551
cC1v9 | 2023-02-24 18:58:49.007792 | p | -1 | 00:00:03.363392 | 00:00:03.063311 | 00:00:02.870398
MBVwj | 2023-02-24 19:10:12.009875 | p | -1 | 00:00:03.255543 | 00:00:03.008074 | 00:00:02.893565
QoJ3w | 2023-02-24 19:16:48.52368 | p | -1 | 00:00:03.236271 | 00:00:03.049509 | 00:00:02.904692
(50 rows)
This needs a little explanation.
Each row is one benchmark.
Each benchmark has five iterations, and the max/avg/min are for all five.
Ten clusters were benchmarked. All in eu-west-1, benchmarked one after the other, with at most a few tens of seconds pause between a cluster being shut down and the next being brought up (although bring up usually takes a couple of minutes). The order of rows is by benchmark type, then within the benchmark type, by cluster, in the order the clusters were brought up.
(Usually I discard the fastest and slowest, but the point of this test is specifically to look to see how much variation there is.)
Here are the columns;
column | description |
---|---|
cluster_id | the first five letters of the AWS cluster ID |
bring_up_start_utc | well, what do you think? :-) |
bm_name | benchmark name (dr = disk-read, drw = disk-read-write, n = network, p = processor) |
node_id | each cluster is 2xdc2.large (so four nodes), for some benchmarks only one node produces a result, for others half the nodes produce a result (note node -1 is the leader node) |
max/avg/min duration | what you think they are, all times in seconds |
The benchmarks are the usual four - disk-read, disk-read-write, network and processor; the PDF describing the benchmarks is here.
Without going into too much detail, because the numbers speak for themselves, we see a great deal of variation.
Particularly intriguing are the clusters “MjdMZ” and “MBVwj”, for the disk-read benchmark.
Their times are almost identical. I suspect I was allocated the same node both times.
We see also great variation between nodes in the same cluster for the disk-read-write benchmarks, where all nodes in the cluster produce a result.
Note though that in the bi-weekly benchmarks, all other node types demonstrate a lot less variation. I need to run the test here on ra3.xlplus, and I also need to re-run this test but on a single cluster, running the benchmarks ten times, to see how much variation we get from a single cluster.
The normal-to-worst differences;
There seems to be a need for software to check for slow nodes, so you can make clusters until you have decent nodes.
(I implemented this a while ago, but it’s not published yet - I’ll add it to the Combobulator AMI.)
Regarding the Combobulator AMI (browser-based PII-safe Redshift management AMI), it’s ready for release, but I’ve just run into what you have to do to get into the AWS AMI Marketplace, and it’s profoundly intrusive and bureaucratic, and it’s given me pause, and I’m thinking now about how to get to market.