UNION, LIMIT and ORDER BY
12 Message(s) by 3 Author(s) originally posted in mysql database
| From: Israel |
Date: Friday, May 25, 2007
|
I have trying to
write a
query that seems like it should be simple but
for some reason my attempts aren't working. This is really a general
SQL quesion and does not pertain to MySQL but I could not find a generic
data base discussion
group except on on advancement and
theory and this
is really a basic query construction question.
Just say I have a
table with three columns, name,
date , score and
these represent the test scores for various people. Each person can
take the test as many times as they like but I only want to get back
the last result.
My first thought did not seem right but I figured I'd try it anyway:
SELECT name, date, score FROM testresults
GROUP BY name
ORDER BY date DESC
And yes this is wrong and doesn't return anything useful since the
score returned seems to be based on where the grouping started in
someway.
My second thought was that the
list of people is small and known so I
was going to just hardcode their names into the query:
SELECT name, date, score FROM testresults WHERE name = 'bob' ORDER BY
date DESC LIMIT 1
UNION ALL
SELECT name, date, score FROM testresults WHERE name = 'mary' ORDER BY
date DESC LIMIT 1
UNION ALL
SELECT name, date, score FROM testresults WHERE name = 'jim' ORDER BY
date DESC LIMIT 1
UNION ALL
This is syntactically incorrect.
Can anyone help me with this query?
| From: strawberry |
Date: Friday, May 25, 2007
|
wrote in
message :
I've trying to write a query that seems like it should be simple but
for some reason my attempts aren't working. This is really a general
SQL quesion and does not pertain to MySQL but I could not find a generic
database discussion group except on on advancement and theory and this
is really a basic query construction question.
Just say I have a table with three columns, name, date, score and
these represent the test scores for various people. Each person can
take the test as many times as they like but I only want to get back
the last result.
My first thought did not seem right but I figured I'd try it anyway:
SELECT name, date, score FROM testresults
GROUP BY name
ORDER BY date DESC
And yes this is wrong and doesn't return anything useful since the
score returned seems to be based on where the grouping started in
someway.
My second thought was that the list of people is small and known so I
was going to just hardcode their names into the query:
SELECT name, date, score FROM testresults WHERE name = 'bob' ORDER BY
date DESC LIMIT 1
UNION ALL
SELECT name, date, score FROM testresults WHERE name = 'mary' ORDER BY
date DESC LIMIT 1
UNION ALL
SELECT name, date, score FROM testresults WHERE name = 'jim' ORDER BY
date DESC LIMIT 1
UNION ALL
This is syntactically incorrect.
Can anyone help me with this query?
SELECT t1.* FROM tests t1
LEFT JOIN tests t2 ON t1.name = t2.name
AND t1.date > t2.date
WHERE t2.date IS NULL;
| From: Good Man |
Date: Monday, May 28, 2007
|
wrote in message in
@xxxxxxxxxxx:
I've trying to write a query that seems like it should be simple but
for some reason my attempts aren't working. This is really a general
SQL quesion and does not pertain to MySQL but I could not find a generic
database discussion group except on on advancement and theory and this
is really a basic query construction question.
Just say I have a table with three columns, name, date, score and
these represent the test scores for various people. Each person can
take the test as many times as they like but I only want to get back
the last result.
My first thought did not seem right but I figured I'd try it anyway:
SELECT name, date, score FROM testresults
GROUP BY name
ORDER BY date DESC
And yes this is wrong and doesn't return anything useful since the
score returned seems to be based on where the grouping started in
someway.
Ideally, you would've two tables here: one with people names (auto-
increment user id, user name) and one with test results (auto-increment
tableID, ID referencing the person, testdate, testresult). This'd
make this query, and others you are likely to create afterwards,
incredibly easier (though this is untested):
SELECT t.date,t.score,u.name
FROM testresults t
JOIN users u ON t.userID=u.userID
GROUP BY t.userID
ORDER BY t.date DESC
The
key to realizing there is a better way to
design your database is
any
replication of data: ie in your table, the name "Bob" is entered as
many times as he took the test. What if he called you up and said "Id
prefer my name be stored as Robert"? You'd have to go trawling through
this table, and any others, changing data when there is really no need
to do so - changing a single entry in the "users" table'd do it.
If you do not want to change your table design, I'd stick with your
query (minus the GROUP BY) and use a
programming language (ie: PHP) to
display the results correctly.
| From: Israel |
Date: Wednesday, May 30, 2007
|
wrote in message:
The key to realizing there is a better way to design your database is
any replication of data: ie in your table, the name "Bob" is entered as
many times as he took the test. What if he called you up and said "Id
prefer my name be stored as Robert"? You'd have to go trawling through
this table, and any others, changing data when there is really no need
to do so - changing a single entry in the "users" table'd do it.
My explanation of the tables was modified for brevity. There's
actually two tables (linked by FK) and the data is actually
parameter
data for the
state of some
hardware components on a
machine that get
logged at a regular interval but I wanted to create a query that'd
just show me the latest for all of the components at arbitrary times.
This query'd only list a handful of parameter types but there's on
the order of 100 parameter types all together and the parameter log
table keeps all of the
history for post-analysis and can have on the
order of 100k-200k
record s so
join ing a table back
onto itself isn't
really
option considering how slow it is.
Unless I'm missing something about the performance of joining in this
way I will probably have to resort to multiple independent queries -
hence my original thought of using UNION ALL which I could not seem to
get the
syntax correct for.
I was trying to avoid having to write any
code because we already have
plans in the works for making a front end
application that will allow
users to extract all sorts of data so I do not want to waste time
making a collection of hodge podge little apps or php code that
duplicates this work. Until we get the final application completed I
wanted to do everything via queries otherwise it'll be like pulling
teeth to get users to stop using all of the "temporary" applications
but in the interim people can still get the data they need.
| From: Israel |
Date: Wednesday, May 30, 2007
|
wrote in message:
SELECT t.date,t.score,u.name
FROM testresults t
JOIN users u ON t.userID=u.userID
GROUP BY t.userID
ORDER BY t.date DESC
With the ORDER BY after the GROUP BY I just end up with the final
groups ordered by the date from some (seemingly random)
row from each
group. What I need is for the first row picked for the group to be
the latest entry for that group but you can not
switch the ORDER BY and
GROUP BY clauses.
| From: strawberry |
Date: Wednesday, May 30, 2007
|
wrote in message:
wrote in message:
> SELECT t.date,t.score,u.name
> FROM testresults t
> JOIN users u ON t.userID=u.userID
> GROUP BY t.userID
> ORDER BY t.date DESC
With the ORDER BY after the GROUP BY I just end up with the final
groups ordered by the date from some (seemingly random) row from each
group. What I need is for the first row picked for the group to be
the latest entry for that group but you can not switch the ORDER BY and
GROUP BY clauses.
I can not see what's wrong with the query I posted previously - but if
you provide us with an example dataset and the result you'd
expect to
obtain from that, perhaps we can help. Ideally
include the CREATE and
INSERT statements too.
| From: Israel |
Date: Wednesday, May 30, 2007
|
wrote in message:
I can not see what's wrong with the query I posted previously - but if
you provide us with an example dataset and the result you'd expect to
obtain from that, perhaps we can help. Ideally include the CREATE and
INSERT statements too.
The issue I have is that it performs a join onto itself and with
100k-200k records it is way to slow.
| From: strawberry |
Date: Thursday, May 31, 2007
|
wrote in message:
wrote in message:
> I can not see what's wrong with the query I posted previously - but if
> you provide us with an example dataset and the result you'd expect to
> obtain from that, perhaps we can help. Ideally include the CREATE and
> INSERT statements too.
The issue I have is that it performs a join onto itself and with
100k-200k records it is way to slow.
Too slow? With appropriate
index es I'd imagine that it'd take
about 5 seconds. Is that too slow? Is 'GROUP BY' quicker!?!.
| From: Israel |
Date: Thursday, May 31, 2007
|
wrote in message:
Too slow? With appropriate indexes I'd imagine that it'd take
about 5 seconds. Is that too slow? Is 'GROUP BY' quicker!?!.
Ok, so I must be missing something then. This is my sql to create the
tables:
CREATE TABLE `parameter` (
`Id` int(10) unsigned NOT NULL auto_increment,
`Name` varchar(45) NOT NULL default '',
PRIMARY KEY (`Id`),
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `parameterlog` (
`ParameterId` int(10) unsigned NOT NULL default '0',
`SampleTimestamp` double NOT NULL default '0',
`Data` varchar(50) NOT NULL default '',
KEY `FK_parameterdata_Parameter` (`ParameterId`),
CONSTRAINT `FK_parameterdata_Parameter` FOREIGN KEY (`ParameterId`)
REFERENCES `parameter` (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The FK_parameterdata_Parameter creates a BTREE index on the
ParameterId
field so performing a join with that field with should as
quick as possible - right?
I ran the following query:
SELECT p1.* FROM `parameterlog` p1
LEFT JOIN `parameterlog` p2 ON p1.ParameterId = p2.ParameterId
AND p1.`Sampletimestamp` < p2.`Sampletimestamp`
WHERE p2.`Sampletimestamp` IS NULL;
On a smaller database with about 56,000 rows in this table it took 141
seconds and returned 73 rows. I did not have the patience to wait for
the query for the larger databases but I can only assume that it has
an
exponential growth.
I ran the following query (which does not produce the correct results):
SELECT * FROM parameterlog
GROUP BY ParameterId
ORDER BY SampleTimestamp
With the same record set of 56,000 rows this query executed in 0.21
seconds.
I'm using a 1.86GHz
Pentium laptop with 1GB
RAM and I'm running MySQL
server 4.1.
| From: strawberry |
Date: Thursday, May 31, 2007
|
wrote in message:
wrote in message:
> Too slow? With appropriate indexes I'd imagine that it'd take
> about 5 seconds. Is that too slow? Is 'GROUP BY' quicker!?!.
Ok, so I must be missing something then. This is my sql to create the
tables:
CREATE TABLE `parameter` (
`Id` int(10) unsigned NOT NULL auto_increment,
`Name` varchar(45) NOT NULL default '',
PRIMARY KEY (`Id`),
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `parameterlog` (
`ParameterId` int(10) unsigned NOT NULL default '0',
`SampleTimestamp` double NOT NULL default '0',
`Data` varchar(50) NOT NULL default '',
KEY `FK_parameterdata_Parameter` (`ParameterId`),
CONSTRAINT `FK_parameterdata_Parameter` FOREIGN KEY (`ParameterId`)
REFERENCES `parameter` (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The FK_parameterdata_Parameter creates a BTREE index on the
ParameterId field so performing a join with that field with should as
quick as possible - right?
I ran the following query:
SELECT p1.* FROM `parameterlog` p1
LEFT JOIN `parameterlog` p2 ON p1.ParameterId = p2.ParameterId
AND p1.`Sampletimestamp` < p2.`Sampletimestamp`
WHERE p2.`Sampletimestamp` IS NULL;
On a smaller database with about 56,000 rows in this table it took 141
seconds and returned 73 rows. I did not have the patience to wait for
the query for the larger databases but I can only assume that it has
an exponential growth.
I ran the following query (which does not produce the correct results):
SELECT * FROM parameterlog
GROUP BY ParameterId
ORDER BY SampleTimestamp
With the same record set of 56,000 rows this query executed in 0.21
seconds.
I'm using a 1.86GHz Pentium laptop with 1GB RAM and I'm running MySQL
server 4.1.
I'm not really familiar with this approach - but what happens if you
put an index on sampletimestamp?
| From: Israel |
Date: Thursday, May 31, 2007
|
wrote in message:
I'm not really familiar with this approach - but what happens if you
put an index on sampletimestamp?
Interestingly I tried that a few times and averaged 147 seconds and
then after I removed it the time averaged 154 seconds. The
load on my
system is roughly the same as it was the first time I ran the test so
I have no clue why it changed but running the query does peg out my
CPU at 100% for a while.
Without more exhaustive
testing I can not determine whether or not
putting an index on sampletimestamp had any appreciable effect.
| From: Good Man |
Date: Friday, June 01, 2007
|
wrote in message in
@xxxxxxxxxxx:> The FK_parameterdata_Parameter creates a BTREE index on the
ParameterId field so performing a join with that field with should as
quick as possible - right?
yeah, but according to your query, you're also performing a join on
'Sampletimestamp'... so those should be indexed as well
I ran the following query:
SELECT p1.* FROM `parameterlog` p1
LEFT JOIN `parameterlog` p2 ON p1.ParameterId = p2.ParameterId
AND p1.`Sampletimestamp` < p2.`Sampletimestamp`
WHERE p2.`Sampletimestamp` IS NULL;> SELECT * FROM parameterlog
GROUP BY ParameterId
ORDER BY SampleTimestamp
With the same record set of 56,000 rows this query executed in 0.21
seconds.
Of course it did. The query just says 'get everything' then group and order
by'. There's not much
processing power/time in getting 'everything'.
Perhaps if you add an index on 'Sampletimestamp' things will happen much
quicker... have you looked at the EXPLAIN result of your query?
Next Message: mySQL backup with SQLyog
Blogs related to UNION, LIMIT and ORDER BY
Implementing a queue in SQL (Postgres version)
greg=# DROP RULE fruitlimit ON q; DROP RULE greg=# CREATE FUNCTION fruitlimit() RETURNS TRIGGER LANGUAGE plpgsql AS $_$ DECLARE maxrows INT = 5; rows INT; BEGIN DELETE FROM q WHERE id NOT IN (SELECT id FROM q
ORDER BY id DESC
LIMIT ...
Re: Complicated UNION query
... works only if I remove GROUP BY and
ORDER BY and it > takes very long time to execute, I have a very large table that's why I > wanted to split it into small identical tables. I think
mysql doesn't > support joining tables in
union.
...
trying to narrow down my results a bit...
... wow_posts.bid, wow_posts.category, wow_posts.subject from wow_comment LEFT JOIN wow_posts on wow_comment.bid=wow_posts.bid
UNION SELECT postdate, user, bid, category, subject from wow_posts
ORDER BY postdate DESC
Limit 10
...
Re: different order in a union all clause
Note: >> On
MySQL, all i need to do is to use parentheses to enclose each >> select... >> (SELECT a FROM t1 WHERE a=10 AND B=1
ORDER BY a
LIMIT 10) >>
UNION >> (SELECT a FROM t2 WHERE a=11 AND B=2
ORDER BY a
LIMIT 10);--------------
...
Re: different order in a union all clause
Note: > On
MySQL, all i need to do is to use parentheses to enclose each select... > (SELECT a FROM t1 WHERE a=10 AND B=1
ORDER BY a
LIMIT 10) >
UNION > (SELECT a FROM t2 WHERE a=11 AND B=2
ORDER BY a
LIMIT 10);-------------- > Select c
...
[SQL and Code Injection] LIMIT and ORDER BY MySQL injection
It is a
LIMIT flaw and it could be exploited to do an
UNION SELECT query. No multiple queries are allowed, it seems... And the ';' char doesn't work. There's magic_quotes activated. Please help me, or at least try and let us know if you
...