[Psycopg] how to vacuum thru psycopg
Tim Roberts
timr at probo.com
Sat Jun 13 00:07:24 CEST 2009
Al Niessner wrote:
> On Fri, 2009-06-12 at 11:58 -0700, Tim Roberts wrote:
>
>> How are you determining whether the pair exists? Are you actually doing
>> a SELECT for each pair? That is, you're doing 44 SELECTs for every
>> picture before doing the INSERT? Since each one of those is a server
>> round trip, doesn't it seem obvious that this is where your performance
>> is being eaten up?
>
> Yes, I do agree that if I bundled them up it would speed things up.
> However, since there are only a fixed number of unique values and that
> set is smaller than 10000 and is probably less than 1000, I do not think
> that is responsible for the linear time growth.
>
Your response leaves my primary question unanswered. When you get a new
image, and you have 44 key/value pairs, how do you discover whether
those pairs already exist in the key/value table? Are you keeping a
Python dictionary in memory, or are you doing a SELECT for each of the
44 pairs? If you are doing a SELECT for each pair, then the number of
unique values is not relevant. The key is that you are doing 44 SELECT
calls per image. 10,000 images with 44 SELECTs each is going to result in
about 440,000 database queries, nearly half a million, over a table that
is getting progressively larger.

If you're doing it with a Python dictionary, then it will still be
linear in time, but it will be a much shorter time.
--
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.