A question occurred to me earlier on today.
So we have STV_BLOCKLIST, like so:
https://www.redshift-observatory.ch/system_table_tracker/1.0.69952/pg_catalog.stv_blocklist.html
This has one row per block (1mb on disk) per column, for all column-store tables.
In this table, we see minvalue and maxvalue, which store the smallest and largest sorting values - not actual values - per block.
(Sorting value is computed from actual value. Sorting value is what you’d expect about half the time. The other half, it ranges from “okay I can see that” through “you got that wrong” to “what drugs were you on at the time?”.)
Now, here’s the question which occurred to me: what do you get for a sorting value when you have a NULL?
Sorting value exists as the implementation of min-max culling, so you can know you don’t need to scan a given block for a given value (because the given value is outside of the range of values in that block).
NULL however means “a value exists, but we don’t know what it is”. In SQL terms, it represents the entire value range of the data type for the column. In min-max culling terms, it represents a single value - you want to know if a block contains NULL or not; a flag would be enough, and you really don’t want to set the min-max range for an entire block (easily millions of rows) to the entire value range of the data type, because there’s a single NULL value.
So what do we see in STV_BLOCKLIST?
Well, if we have a column which is int8 (where the actual value is the sorting value - it’s one of the data types which is what you’d expect), and we have a block where there is one row and it is NULL, we get minvalue 9223372036854775807 and maxvalue -9223372036854775808.
Yes, really. No, it’s not a cut’n’paste error.
In fact, I think those numbers mean “empty block” to RS. When you have a compound sorted table, and it is 100% sorted, you still have an unsorted segment, which is one block per column per slice, but it’s empty. That block has those numbers.
However, num_values on the block in my test table, with the NULL row, says 1.
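You can look at this directly; a query of the sort I was using (the schema and table names here are assumptions - use whatever your test table is called):

select tbl, col, blocknum, num_values, minvalue, maxvalue
from stv_blocklist
where tbl = ( select table_id
              from svv_table_info
              where "schema" = 'public' and "table" = 'test_1' )
order by col, blocknum;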
Then, if we insert a non-NULL value, what we find is the min and max values are those of and only of the non-NULL values in the column. There’s no sign of NULL at all.
So my problem now is this: I have no idea how Redshift knows from STV_BLOCKLIST whether or not NULL is present in any given block.
Anyone know anything about this?
So, as mentioned, I could find no information in STV_BLOCKLIST indicating it knows about NULL.
It then occurred to me maybe it really doesn’t know about NULL, in which case it would need to scan everything - which would be of profound significance, something users absolutely have to know about.
I’ve set up and run a first quick check - a table with two columns, one used for distribution (always the same value, so all rows on one slice), the other an identity column. Both encode raw. I wrote 1,309,940 rows.
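For reference, a sketch of the DDL involved - this is my reconstruction from the description above, not the actual test script, and in particular which column is which is a guess based on the query below:

create table public.test_1
(
  column_0  int8  identity(1,1)  encode raw,  -- the identity column
  column_1  int2  encode raw                  -- always the same value, drives distribution
)
diststyle key
distkey( column_1 );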
I then issued
select count(*) from public.test_1 where column_1 is null;
and checked how many rows were materialized…
…and it’s all 1,309,940.
It’s one test only - I have to go to the gym now - and I could have slipped up; I’ll need to repeat and confirm in other ways. But at this point, the earliest of early days, it looks like matching against NULL means min-max culling is out the window. You read and materialize every row in the table.
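If you want to check the row counts for yourself, one way (an assumption on my part - not necessarily how I checked) is STL_SCAN, run in the same session immediately after the query; rows_pre_filter is rows read before deleted-row filtering:

select query, sum(rows) as rows_returned, sum(rows_pre_filter) as rows_scanned
from stl_scan
where query = pg_last_query_id()
group by query;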
I thought I would write something about the current state of play, as I’ve not written about Putin’s war for a long time.
So, current state of play: the power of defence is enormous, and much greater than the power of offence. Drones everywhere, complete observation of the battlefield, artillery systems tied into these observation systems. Any advance is horrendously expensive and minute.
Putin has thousands of people dying every week to advance a kilometer or so. Ukraine is huge - this is like inching your way to Alpha Centauri - and it will stay like this as long as Ukraine has a viable supply of artillery ammunition, drones, and enough troops to maintain a reasonable front line.
Right now Putin has a temporary boost from North Korean artillery supplies. It won’t last, and then Russia is back to what it can produce for itself, and, more critically, to the limited supply of new artillery barrels (they wear out, and pretty quickly when used a lot), and increasing production of those takes five plus years.
So where’s it all going now?
First, Putin is hoping for Trump to win, and for Trump to cut off US aid to Ukraine. This is Trump’s amazing plan to end the war in one day. I may be wrong, but I think Trump has no idea what it means to run a country, rather than a business; he’s above his ceiling. Countries get their income from tax, they don’t need to worry about business competition, and they’re occupied with very fundamental and long-term concerns, all of which are costs: peace, justice, low taxes. This is why the US gives Ukraine 60 billion every six months or so; it directly combats the risk of Russia attacking NATO, and peace in Europe is central to American self-interest. If Europe falls to Russia, America, powerful as it is, will be over-matched by China and Russia combined, and we need not dwell on the horrors and chaos of an invasion of Taiwan or Europe. Peace and justice are not only peace and justice in your own country.
As things stand, Trump will win. If Biden stands down, then there could be a chance for the Dems.
In any event, this will make life difficult for Ukraine, but it’s as likely to make them surrender as turn the Moon into cheese. The Europeans won’t stand for a surrender either - they know exactly what it means, because it’s on their doorstep and because by and large their leaders know what to worry about when running a country.
The Europeans have been fucking slow on the ball getting munition production up to speed. Hopefully they’re getting the idea now and getting on with it - it takes two or more years to really start ramping up production.
I think Europe can sustain Ukraine on its own, without the USA.
Second, China is providing massive support to Russia and is really fundamental to Russian military production. Xi wants to get Taiwan, and this war suits him very well; America is distracted by Russia. At some point, if he thinks he can manage it, Xi will invade Taiwan. No doubt Xi is praying for Trump, too.
So that leaves us with an ongoing Ukrainian war with no end in sight; the power of the defence is simply too great.
I can’t see Ukraine surrendering / agreeing to a cease-fire (the same thing, as it means all territory held now by Putin is lost). Europe doesn’t want this either (well, barring Orban and Fico, but they’re both Putin’s men and everyone else is horrified by them.)
So I’m looking to a couple of things.
Firstly, a fundamentally new weapon - the modern equivalent of the invention of the tank - which is going to be AI in drones. This will be a game changer. If Ukraine gets this first, it will regain the power of offence and I think the Russian army will then be in serious trouble.
Secondly, Ukraine will develop its own nuclear capability (it’s working on it now, for sure).
The latter of course is fraught - Putin might launch a pre-emptive nuclear strike on Ukraine, and where does the world go from there?
In the background, elsewhere in the world, we have Russia and Iran co-operating, with Iran developing indirect war against Israel to distract from Ukraine, and Russia providing military technology to North Korea and its nuclear and missile programmes; we can expect NK to start threatening South Korea, as another distraction. We can also look to see China invading Taiwan, and, finally, once the Ukraine war is over, however it ends, Russia having a go at attacking and shattering NATO, if the auguries are promising (Trump in power, China invading Taiwan).
Remember here that NATO without the USA is much weaker than Russia.
So in general, the longer the war goes on, the more likely it becomes that something somewhere slips up, and a really serious problem inadvertently develops.
Europe frankly needs to get off its ass and make Ukraine win, and as soon as possible. This is now the only safe way forward.
There was one other matter in my thoughts which I did not cover in the previous post.
I keep an eye on the news about Ukraine.
I have to say there are a few Western publications, which DuckDuckGo seems to like to list, which to my eye are producing propaganda, presumably tailored for their readership. I have in mind here The New York Times, Politico, and Business Insider.
Whatever is going on at the time is taken, and all knowledge, understanding, insight and grasp of reality are set firmly aside; the story twists events into a parody, wholly setting aside even the most basic understanding of war or of what is actually happening, with the purpose of conveying whatever it is the publication wants to convey. There’s no truth, accuracy, realism, insight, or anything like that.
I lived in Kyiv for about a year, the first year of the war, and I’ve been reading military history, and history, for decades. I have some idea of how these things go, and what it’s like there now. What’s being printed in those publications about Ukraine is a fantasy, deliberately twisted to a particular premeditated end, whatever that is, for whatever reason it is being done.
https://docs.aws.amazon.com/redshift/latest/dg/r_VACUUM_command.html
A vacuum operation might not be able to start if a load or insert operation is already in progress. Vacuum operations temporarily require exclusive access to tables in order to start. This exclusive access is required briefly, so vacuum operations don’t block concurrent loads and inserts for any significant period of time.
As far as I can tell, by experimenting and by near-real-time monitoring of locks (200ms polling), VACUUM does not take an exclusive lock, and neither blocks, nor is blocked by, INSERT or SELECT. I’ve not tested properly with DELETE or UPDATE (but they both open with the same lock as INSERT). I’ve not tested with COPY.
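For what it’s worth, the sort of poll I mean looks like this - run repeatedly (I was polling at 200ms) from another session while the VACUUM runs; the table name here is an assumption:

select xid, pid, lock_mode, granted
from svv_transactions
where relation = ( select table_id
                   from svv_table_info
                   where "schema" = 'public' and "table" = 'test_1' );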
Clusters in region eu-north-1 look to have been substantially downgraded.
https://www.redshift-observatory.ch/cross_region_benchmarks/index.html
It might be this affects only new clusters, and existing clusters are unchanged. I don’t know.
I’ve been investigating ANALYZE.
This is not a completely thorough investigation - this is work being done for immediate use with a client - but there’s a number of fascinating findings.
To begin with, it’s become clear the table stats generated by analyze really matter. I always thought they must, but I’d never looked for or seen evidence - where-as without looking for it I have seen such evidence working with Postgres. With Redshift, I had taken a set of tables and a query run on them, put copies of the tables into my schema, and was running the query on those copies. I was getting a 3.5 billion byte distribute step, which the query was not performing on the original tables. I ran analyze on everything, and that step went away.
Now for a collection of facts which I’ve discovered.
When you create a table, it has no table stats at all. They are not initialized to empty - they are missing.
If you perform a classic resize, table stats are wiped on all tables. This is astonishing, given that the number of rows has not changed - I think it just means the information is not being brought over. Given that table stats matter, and that running analyze on every table in a large busy cluster can take as much as a week, this is a major (and as ever, completely undocumented) issue.
The column pg_class.reltuples is not maintained when insert/delete occurs. It is - I presume - updated at certain times and by certain events, I know not what. When I compare it to the table row counts from stv_blocklist, about a third of tables are correct, a third are medium wrong, and a third are completely wrong (10x wrong).
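The comparison, roughly, in SQL - summing num_values over the blocks of column 0 gives a per-table row count (which includes deleted-but-not-yet-vacuumed rows):

select c.oid              as table_id,
       c.relname,
       c.reltuples::int8  as reltuples,
       sum(b.num_values)  as blocklist_rows
from pg_class c
     join stv_blocklist b on b.tbl = c.oid
where b.col = 0
group by c.oid, c.relname, c.reltuples
order by c.relname;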
The table pg_statistic_indicator also has a count of the number of rows in tables. This also is not maintained, and is wrong in much the same way as reltuples. The docs say “updated frequently”, and I don’t buy it - it’s too wrong.
The table pg_statistic_indicator has two more columns, counts of inserts and deletes. These - compared to totals from STL_INSERT and STL_DELETE - normally (but not always) look very accurate; not completely, but very, certainly good enough to use. I have to say normally, because I’m looking now at a table where STL_DELETE says 1,219,526 but pg_statistic_indicator says 707,524.
(I’ve just found another, where STL_INSERT says 2,904,447,040, and pg_statistic_indicator says 1,766,001,280.)
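For reference, the kind of comparison being made here. Note the STL tables retain only a few days of history, so the totals only line up for tables whose whole write history is that recent, and the pg_statistic_indicator column names (stairelid, staiins, staidels) are my reading of the catalog, not documented names:

select c.relname,
       si.staiins           as psi_inserts,
       si.staidels          as psi_deletes,
       coalesce(i.total, 0) as stl_insert_total,
       coalesce(d.total, 0) as stl_delete_total
from pg_statistic_indicator si
     join pg_class c on c.oid = si.stairelid
     left join ( select tbl, sum(rows) as total from stl_insert group by tbl ) i on i.tbl = si.stairelid
     left join ( select tbl, sum(rows) as total from stl_delete group by tbl ) d on d.tbl = si.stairelid
where c.relname = 'test_1';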
One question for me though is what happens with these columns with regard to insert, delete and update. Insert I expect to increase the insert total, delete I expect to increase the delete total - what does update do? Increase both? I would think so, but I’ve not checked, and I’m currently looking at some output where what looks to me like an update - 128 rows inserted in STL_INSERT and 128 rows deleted in STL_DELETE - has produced only 128 rows deleted in pg_statistic_indicator. However, I’ve not specifically checked.
As an aside, this is not a Postgres table - in Postgres, the statistics table is pg_statistic. This is a similarly named Redshift-originated table.
Auto-analyze, when it issues an analyze, sets the analyze threshold (threshold_percent in STL_ANALYZE) to 90 - which is to say, Redshift needs to think a table is 90% different to how it was when last analyzed for analyze to run. On a large cluster, 80k tables, looking back over a week, auto-analyze ran 24 times - on 80,000 tables - and 22 of those 24 runs were skipped because of the 90% threshold. Auto-analyze looks like a complete and total waste of time.
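A query along these lines shows this - status, threshold_percent and is_auto are documented STL_ANALYZE columns, and status is either 'Full' or 'Skipped':

select status, threshold_percent, count(*) as num_runs
from stl_analyze
where is_auto = 't'
  and starttime > dateadd( day, -7, getdate() )
group by status, threshold_percent
order by num_runs desc;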
Analyze updates pg_class.reltuples.
I still have no idea, when you issue ANALYZE; (which will analyze every table), in what order tables are analyzed.
Elastic resize leaves table stats intact.
So, what I wanted to do was write a Python script which would analyze tables needing analyze, and do them in an order which I cared about.
To begin with, I relied on pg_class.reltuples and pg_statistic_indicator. I had found that if reltuples was a very small value (it’s a float8), between about 0.001 and 0.00100000005, table stats were considered missing (EXPLAIN will tell you this if you run a query on the table). I also found tables could be missing from pg_statistic_indicator, which to my eye also means no stats. Finally, I took it that if reltuples and the row count in pg_statistic_indicator were equal, and pg_statistic_indicator indicated zero rows inserted or deleted since the last analyze, then table stats were good - and if not, stale.
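A sketch of the missing-stats tests in SQL (not my actual script; the range constants are from observation, and stairelid is my assumed name for the table oid column in pg_statistic_indicator):

select c.oid as table_id, c.relname
from pg_class c
where c.relkind = 'r'
  and (    ( c.reltuples >= 0.001 and c.reltuples <= 0.00100000005 )
        or not exists ( select 1
                        from pg_statistic_indicator si
                        where si.stairelid = c.oid ) );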
Having been investigating today, I realized both row counts are not reliable and only become correct after ANALYZE, which makes it rather difficult to use them to decide whether or not to issue ANALYZE :-)
So what I’m doing now is working from first principles.
I’m using STV_BLOCKLIST to get the actual row counts in tables, looking at STL_ANALYZE to get the most recent full analyze, looking at STL_INSERT and STL_DELETE (and using the most recent analyze timestamp) to figure out the count of inserts and deletes since the most recent analyze (update will add to both totals), and then if the current row count is sufficiently different (according to my current criteria) from how it was when analyze was last issued, I’ll issue it again.
This approach also means I can easily ignore empty tables, which I want to do, since there are 80k tables but only about 32k have blocks in STV_BLOCKLIST.
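A sketch of this approach in SQL. The 10% change threshold and the general shape of the query are mine; status = 'Full' in STL_ANALYZE is how I pick out full analyzes:

-- Current rows from STV_BLOCKLIST (column 0, so deleted-but-unvacuumed
-- rows are counted too), most recent full analyze from STL_ANALYZE, and
-- inserts/deletes since then from STL_INSERT/STL_DELETE.
with
current_rows as
( select tbl, sum(num_values) as row_count
  from stv_blocklist
  where col = 0
  group by tbl ),
last_analyze as
( select table_id, max(endtime) as analyze_time
  from stl_analyze
  where status = 'Full'
  group by table_id ),
inserts_since as
( select i.tbl, sum(i.rows) as row_count
  from stl_insert i join last_analyze la on la.table_id = i.tbl
  where i.starttime > la.analyze_time
  group by i.tbl ),
deletes_since as
( select d.tbl, sum(d.rows) as row_count
  from stl_delete d join last_analyze la on la.table_id = d.tbl
  where d.starttime > la.analyze_time
  group by d.tbl )
select cr.tbl                     as table_id,
       cr.row_count,
       coalesce(ins.row_count, 0) as inserts,
       coalesce(del.row_count, 0) as deletes
from current_rows cr
     left join inserts_since ins on ins.tbl = cr.tbl
     left join deletes_since del on del.tbl = cr.tbl
where coalesce(ins.row_count, 0) + coalesce(del.row_count, 0) >= cr.row_count * 0.1;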
So the upshot of all this - auto-analyze does nothing, and you can’t use pg_statistic_indicator to figure out when to analyze, so you have to make your own sub-system to run analyze.
I’ve just made a quick off-the-cuff test of the boost option to VACUUM.
https://docs.aws.amazon.com/redshift/latest/dg/r_VACUUM_command.html
Runs the VACUUM command with additional resources, such as memory and disk space, as they’re available. With the BOOST option, VACUUM operates in one window and blocks concurrent deletes and updates for the duration of the VACUUM operation. Running with the BOOST option contends for system resources, which might affect query performance.
I tested on a two node dc2.large, 130,994,000 rows on each slice (two columns, int2 and int8); the int2 controls distribution, sorting is on the int8, which is identity(1,1).
I issued VACUUM FULL [table] TO 100 PERCENT, with and without BOOST, a number of times (unsorting the table between runs with update test_1 set column_0 = column_0;).
Performance was unchanged.
I manually checked locking (did not monitor in real-time), and the locks on the table looked unchanged to me - there was no ShareRowExclusiveLock, which would be needed to block other writes.
So, all I can say is, it looked like boost was not actually happening. Locks were unchanged, performance unchanged. Maybe I did something wrong?
Okay. Weird stuff. I’ve been having an ad hoc look at the impact of elastic downsizing on slices and slice memory. I started with 4 nodes of dc2.large, where the nodes have 2 slices each, then elastic downsized to 2 nodes, each node now having 4 slices. What I expected, from what I saw with Serverless, is that each node would continue to run only the number of slices given in the node’s specification. What I see is that each node really is running all slices concurrently, which on the face of it is crazy - each slice now has half the memory it used to have.
Serverless uses ra3 type nodes. Maybe they behave differently.
Oooof. Bingo - although ad hoc so I could be blundering. Looks like dc2 on elastic downsize actually runs all slices concurrently (so you have proportionately less memory per slice), where-as ra3 runs only the specification number of slices concurrently (and has each slice read the data of the other slices). I need to do this test properly, with a repeatable test script - it’s too complex to be sure I’m not messing things up.
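One quick check, for what it’s worth, is the slice-to-node mapping after the resize; STV_SLICES shows which node each slice is on:

select node, count(*) as slices
from stv_slices
group by node
order by node;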