i have column 4 options. column define text. table big table 100 millions of record , keep going. table use report table. index on table - provider_id,date,enum_field.
i wonder if should change enum_filed text int , how performance critical.
using postgres 9.1
table: provider_report: id bigserial not null, provider_id bigint, date timestamp without time zone, enum_field character varying, ....
index: provider_id,date,enum_field
tl;dr version: worrying not worth time.
long version:
there enum type in postgres:
create type myenum enum('foo', 'bar');
there pros , cons related using vs varchar or integer field. pros imho.
in terms of size, it's stored oid
, int32
type. makes smaller varchar populated typical values (e.g. 'draft'
, 'published'
, 'pending'
, 'completed'
, whatever enum about), , same size int
type. if you've few values, smallint
/ int16
admittedly smaller. of performance change come there (smaller vs larger field, i.e. negligible).
validation possible in each case, through built-in catalog lookup enum
, or check constraint or foreign key varchar
or int
. of performance change come there, , it'll not worth time either.
another benefit of enum type, is ordered. in above example, 'foo'::myenum < 'bar'::myenum'
, making possible order enumcol
. achieve same using varchar
or int
, you'll need separate table sortidx
column or something... in case, enum can yield enormous benefit if ever want order enum's values. brings (imho) gotcha, related how enum type stored in catalog...
internally, each enum's value carries oid
, , latter stored as is within table. it's technically int32. when create enum type, values stored in correct order within catalog. in above example, 'foo'
have oid
lower 'bar'
. makes efficient postgres order enum's value, since amounts sorting int32
values.
when alter
enum, however, may end in situation change order. instance, imagine alter above enum in such way myenum
('foo', 'baz', 'bar')
. reasons tied efficiency, postgres not assign new oid
existing values , rewrite tables use them, let alone invalidate cached query plans use them. instead, populate separate field in the pg_catalog
, make yield correct sort order. point forward, ordering enum field requires lookup, de facto amounts joining table separate values table carries sortidx
field -- varchar
or int
if ever wanted sort them.
this fine , acceptable. occasionally, it's not. when not there solution: alter tables enum type, , change values varchar. locate , adjust functions , triggers make use of do. drop type entirely, , recreate fresh oid values. , alter tables were, , readjust functions , triggers. not trivial, feasible.
Comments
Post a Comment