Database Text Searching using Triggers, TSearch2, Materialized Views and User Security - PowerPoint PPT Presentation

1 / 40
About This Presentation
Title:

Database Text Searching using Triggers, TSearch2, Materialized Views and User Security

Description:

The Statistical Center for HIV/AIDS Research & Prevention (SCHARP) is part of ... SCHARP provides statistical collaboration to infectious disease researchers ... – PowerPoint PPT presentation

Number of Views:90
Avg rating:3.0/5.0
Slides: 41
Provided by: Loer4
Category:

less

Transcript and Presenter's Notes

Title: Database Text Searching using Triggers, TSearch2, Materialized Views and User Security


1
Database Text Searching using Triggers,
TSearch2,Materialized Viewsand User Security
  • By Lloyd Albin

2
SCHARPThe Statistical Center for HIV/AIDS
Research Prevention
  • The Statistical Center for HIV/AIDS Research
    Prevention (SCHARP) is part of the program of
    Population Sciences within the Vaccine and
    Infectious Disease Institute (VIDI) of the Fred
    Hutchinson Cancer Research Center in Seattle,
    Washington. SCHARP provides statistical
    collaboration to infectious disease researchers
    around the world and conducts a complementary
    program of statistical methodology, and
    mathematical modeling research. SCHARP also
    collects, manages, and analyzes data from
    clinical trials and epidemiological studies
    dedicated to the elimination of infectious
    disease as a threat to human health.
  • Formerly known as HIVNET, SCHARP has been in
    existence since 1994. We currently serve as the
    Statistical and Data Management Center for the
    HIV Prevention Trials Network (HPTN), the HIV
    Vaccine Trials Network (HVTN), and the
    Microbicide Trials Network (MTN). The HVTN, HPTN,
    and MTN are part of the Division of AIDS (DAIDS)
    of the National Institute of Allergy and
    Infectious Diseases (NIAID), a component of the
    National Institutes of Health (NIH). SCHARP also
    collaborates with other organizations, including
    the Chinese Center for Disease Control and
    Prevention, the Center for HIV-AIDS Vaccine
    Immunology (CHAVI), the Partnership for AIDS
    Vaccine Evaluation (PAVE), and the International
    Maternal Pediatric Adolescent AIDS Clinical
    Trials (IMPAACT).
  • SCHARP conducts clinical trials in over 19
    countries around the globe, including several
    Sub-Saharan African and Caribbean countries,
    India, Russia, China, Thailand, Brazil, and Peru.

3
The Project
  • Create a search engine to search the database and
    return relevant results without knowing what
    tables or columns to search.
  • Return result that the logged in user has select
    (read) access to.

4
Other Issues
  • Joining all the tables and search all the columns
    can take a very long time
  • The Tsearch2 encoder is not fast.
  • With both of these issues, it caused me to use
    the materialized view, so that all the encoding
    was already done.

5
The Solution
  • Add triggers to the tables that we wish to be
    searched and have them update a materialized view
    that contains the GIS version of the data created
    by TSearch2
  • Create a function that searches this materialized
    view and returns the results.
  • Now go back to that function and add in User
    security, so that the user only sees data that
    they have permission to view.

6
Install TSearch2
  • Due to the fact that we are running Solaris 9
    with PostgreSQL v8.2.x we had to install
    TSearch2.
  • We are have just completed upgrading to Solaris
    10 and are planning to upgrade to PostgreSQL
    v8.3.x in October and will be revising these
    tools at that point.
  • Also an issue is that we are running the 64 bit
    version of PostgreSQL and this makes the upgrades
    harder.
  • To Install Tsearch2
  • psql -h server_name -U postgres -d database_name

7
The Materialized View
  • CREATE TABLE "tools"."search_table" (
  • "search_id" SERIAL,
  • "row_id" TEXT,
  • "column_name" NAME,
  • "table_name" NAME,
  • "schema_name" NAME,
  • "database_name" NAME,
  • "server_name" NAME,
  • "vector" TSVECTOR,
  • "description" TEXT,
  • CONSTRAINT "search_table_pkey" PRIMARY
    KEY("search_id")
  • ) WITH OIDS
  • CREATE INDEX "search_table_index" ON
    "tools"."search_table"
  • USING gist ("vector" "public"."gist_tsvector_ops
    ")

8
The Trigger
  • CREATE TABLE "al"."list_ldms_additive_code" (
  • "pk" SERIAL,
  • "ldms_additive_code" VARCHAR(255),
  • "additive_description" TEXT,
  • "info_source" VARCHAR(255),
  • "notes" TEXT,
  • CONSTRAINT "list_ldms_additive_code_ldms_additiv
    e_code_key" UNIQUE("ldms_additive_code"),
  • CONSTRAINT "list_ldms_additive_code_pkey"
    PRIMARY KEY("pk")
  • ) WITH OIDS
  • CREATE TRIGGER "search_list_ldms_additive_code_tr"
    AFTER INSERT OR UPDATE OR DELETE
  • ON "al"."list_ldms_additive_code" FOR EACH ROW
  • EXECUTE PROCEDURE "tools"."update_search"(atlassql
    -a, pk)

9
Search Function Examples
  • Search Function
  • SELECT FROM tools.database_search
    (search_string, server_name, database_name,
    schema_name, table_name, column_name)
  • Search Parameters
  • search_string is mandatory and is really the only
    item needed. If you just fill in this one without
    filling in any other items, the the entire
    database will be searched for the text string.
  • server_name is optional and really reserved for
    future use. (If you are using schema_name, you
    may also wish to specify server_name and
    database_name also to have your code pre-written
    for the future.)
  • database_name is optional and really reserved for
    future use. (If you are using schema_name, you
    may also wish to specify server_name and
    database_name also to have your code pre-written
    for the future.)
  • schema_name is optional. This allows you to
    search a specific schema instead of the entire
    database.
  • table_name is optional. This allows you to search
    a specific table. Without specifying the
    schema_name it will search all tables with that
    table name across in the entire database.
  • column_name is optional. This allows you to
    search a specific column. Without specifying the
    table_name or schema_name it will search all
    tables across in the entire database that use
    that column_name.
  • The server_name, database_name, schema_name,
    table_name, and column_name allow you to perform
    a WHERE on the results before the search is done
    instead of afterwards. This makes the search
    function much faster as it does not need to
    search all those extra results.
  • Search Strings Commands
  • and
  • or
  • ! not
  • () grouping
  • Search String Examples
  • yellow
  • grass green
  • sun (yellow red)

10
Search Function Examples
  • Search String Results Examples
  • SELECT FROM tools.database_search ('yellow',
    '', '', '', '', '')

SELECT FROM tools.database_search ('grass
green', '', '', '', '', '')
11
Search Function Examples
SELECT FROM tools.database_search ('sun
(yellow red)', '', '', '', '', '')
12
Search Function Examples
SELECT FROM tools.database_search ('sun
!red)', '', '', '', '', '')
  • Bugs / Notes
  • If your key field data contains a ' the response
    may not be escaped and so therefore may be
    unusable. This needs to be verified and then
    fixed if this is the case.
  • This function does not understand when you switch
    roles. This is because the function needs to run
    with its own permissions and this will skew the
    results if it was using the role permissions.
  • Security Note
  • You will only see the responses for tables that
    you have SELECT permissions.

13
Bulk Update(Add an entire table)
  • Insert Search Data Function
  • SELECT tools.insert_search_data ('server_name,
    schema_name, table_name')
  • SELECT tools.insert_search_data ('server_name,
    schema_name, table_name, ignore_column_name,
    ...')
  • Insert Search Data Parameters
  • server_name The name of the server that that
    the table resides in.
  • schema_name The name of the schema that that
    the table resides in.
  • table_name The name of the table that you want
    to be searchable.
  • ignore_column_name The name of the column in
    the table that you do not wish to be searchable,
    such as personal data, payroll data, id keys,
    etc.
  • Search String Examples
  • SELECT tools.insert_search_data('atlassql-test,
    public, testing') SELECT tools.insert_search_dat
    a('atlassql-test, public, testing, key_id,
    not_string')
  • Notes
  • If you run this command more than once on the
    same table, you will get duplicate results. You
    must run the drop_search_data function between
    each insert_search_data function.
  • If any records are inserted, updated, or deleted
    while this is happening you may have some
    integrity issues.
  • If you are dropping and re-creating the
    view/tables, it is better to drop the table, run
    the drop_search_data function, then create the
    table, add the trigger and finally insert the
    data. This way there will not be any data
    integrity issues.
  • This function is really just designed for doing
    the initial inserts for an existing table that
    the trigger is being added to.

14
Bulk Delete(Entire table or more at once)
  • Drops Search Data Function
  • SELECT tools.drop_search_data (server_name,
    database_name, schema_name, table_name)
  • Drop Search Data Parameters
  • server_name is optional and really reserved for
    future use. (If you are using schema_name, you
    may also wish to specify server_name and
    database_name also to have your code pre-written
    for the future.)
  • database_name is optional and really reserved for
    future use. (If you are using schema_name, you
    may also wish to specify server_name and
    database_name also to have your code pre-written
    for the future.)
  • schema_name is optional. This allows you to
    search a specific schema instead of the entire
    database.
  • table_name is optional. This allows you to search
    a specific table. Without specifying the
    schema_name it will search all tables with that
    table name across in the entire database.
  • You must enter one of the above or nothing will
    be processed
  • Search String Examples
  • SELECT tools.drop_search_data ('atlassql-test',
    'atlastest', 'public', 'testing')
  • SELECT tools.drop_search_data ('atlassql-test',
    'atlastest', 'public', '')
  • SELECT tools.drop_search_data ('atlassql-test',
    'atlastest', '', '')

15
Lookup Role Users
  • User Search Function
  • This function will return to you a list of
    everybody who is attached to a role (user or
    group) that has inherit privileges turned on.
  • SELECT FROM tools.lookup_role_users('role_name')
  • Search Parameters
  • role_name
  • This is either a user or group name
  • Search String Examples
  • SELECT FROM tools.lookup_role_users('dba')
  • SELECT FROM tools.lookup_role_users('lalbin')
  • Search String Results Examples
  • SELECT FROM tools.lookup_role_users('dba')
  • SELECT FROM tools.lookup_role_users('lalbin')

16
Lookup User Permissions
  • User Privilege Search Function
  • This function returns to you a list of the items
    and the users who have the privilege type
    requested. This function does drill down through
    the groups also to find additional users why have
    inherit privileges turned on. You also must use
    the DISTINCT because this function returns
    duplicates because of the drill down that it
    performs.
  • SELECT DISTINCT FROM tools.display_user_perms('i
    tem_type','privilege_type')
  • Search Parameters
  • item_type
  • r ordinary table
  • i index
  • S sequence
  • v view
  • c composite type
  • s special
  • t TOAST table
  • privilege_type
  • r SELECT ("read")
  • w UPDATE ("write")
  • a INSERT ("append")
  • d DELETE
  • R RULE

17
Lookup User Permissions
  • Search String Examples
  • SELECT DISTINCT FROM tools.display_user_perms('r
    ','r') WHERE nspname 'public'
  • SELECT DISTINCT FROM tools.display_user_perms('r
    ','r') WHERE rolname 'lalbin'
  • SELECT DISTINCT FROM tools.display_user_perms('r
    ','r') WHERE relname 'testing'
  • SELECT DISTINCT FROM tools.display_user_perms('v
    ','r') WHERE rolname 'lalbin'
  • SELECT DISTINCT FROM tools.display_user_perms('r
    ','w') WHERE rolname 'lalbin'
  • Search String Results Examples
  • SELECT DISTINCT FROM tools.display_user_perms('r
    ','r') WHERE nspname 'public'

18
Lookup User Permissions
  • SELECT DISTINCT FROM tools.display_user_perms('r
    ','r') WHERE rolname 'lalbin'
  • Security Note
  • This function does not understand when you switch
    roles. This is because the function needs to run
    with its own permissions and this will skew the
    results if it was using the role permissions.
  • This function will only return results from the
    database that you are currently logged into.

19
Foreign Key Relationships(Crabbing or Crawling
the Data sideways)
  • Foreign Key Relationship Search Function
  • SELECT FROM tools.foreign_key_details
    (schema_name, table_name, direction)
  • Foreign Key Relationship Parameters
  • schema_name The name of the schema that that
    the table resides in.
  • table_name The name of the table that you want
    to be searchable.
  • direction 0 - returns tables that are
    referenced by this data, 1 - returns tables that
    are reference by this data.
  • Search Foreign Key Relationship Examples
  • SELECT FROM tools.foreign_key_details('public',
    'testing, 0)
  • Search Foreign Key Relationship Results Examples
  • SELECT FROM tools.foreign_key_details ('al',
    'list_antibody, 0)
  • Security Note
  • The function knows which role you are logged in
    as and will return the appropriate results.

20
update_search()
  • CREATE OR REPLACE FUNCTION "tools"."update_search"
    () RETURNS trigger AS
  • body
  • Half of this code was taken from the shadow_fn
  • If shadow_fn has been updated you may wish to
  • also apply those updates here.
  • my updates_table 0
  • my database_name ""
  • my server_name
  • if (length(_0) 0)
  • server_name _0
  • else
  • server_name "atlassql-test"
  • if the search_table_updates table doesn't
    exist then error

21
update_search()
  • get the primary key values for the row
  • sql "SELECT column_name FROM
    information_schema.key_column_usage "
  • ."WHERE table_schema'"._TD-table_schem
    a."' "
  • ."AND table_name'"._TD-table_name."'
    "
  • ."AND position_in_unique_constraint IS
    NULL "
  • ."ORDER BY ordinal_position"
  • rv spi_exec_query(sql)
  • join primary key fields into one big field
  • my oldnew _TD-event eq "DELETE" ?
    "old" "new"
  • my _at_pkey_vals map vals_-column_name
    oldnew _at_rv-rows
  • my _at_pkey_cols map _-column_name
    _at_rv-rows
  • my pkey join(",",_at_pkey_vals)
  • my pkey join("''",_at_pkey_cols,_at_pkey_vals)
  • pkey pkey . "''"
  • my pkey ""
  • while (_at_pkey_vals)
  • if (length(pkey))

22
update_search()
  • insert for each changed field
  • foreach my c (_at_cols)
  • process_column 0
  • update only the changed columns
  • if (valscold ne valscnew)
  • process_column 1
  • if UPDATE and the keys have changed, update
    all columns
  • if (_TD-event eq "UPDATE")
  • if (pkey ne pkey_update)
  • process_column 1

23
update_search()
  • if (_TD-event eq "INSERT")
  • sql "INSERT INTO
    tools.search_table "
  • ."(row_id, column_name,
    table_name, schema_name, "
  • ."database_name, server_name,
    vector, description) "
  • ."VALUES ('pkey', 'c',
    '"._TD-table_name."', "
  • ."'"._TD-table_schema."',
    current_database(), 'server_name', "
  • ."to_tsvector('".valscnew
    ."'), '".valscnew."')"
  • spi_exec_query(sql)
  • elsif (_TD-event eq "UPDATE")
  • sql "UPDATE tools.search_table
    SET "
  • ."row_id 'pkey',
    column_name 'c', "
  • ."table_name
    '"._TD-table_name."', schema_name
    '"._TD-table_schema."', "
  • ."database_name
    current_database(), server_name 'server_name',
    "
  • ."vector to_tsvector('".val
    scnew."'), description '".valscnew."'
    "
  • ."WHERE "
  • ."row_id 'pkey_update' AND
    column_name 'c' AND "

24
database_search(search text, server, database,
schema, table, column)
  • CREATE OR REPLACE FUNCTION "tools"."database_searc
    h" (text, text, text, text, text, text) RETURNS
    SETOF "tools"."schema_search_t" AS
  • body
  • my row
  • my search_config ""
  • if (length(_1) 0)
  • if (length(search_config) 0)
  • search_config search_config . " AND
    server_name '" . _1 . "'"
  • else
  • search_config " WHERE server_name '" .
    _1 . "'"
  • if (length(_2) 0)

25
database_search(search text, server, database,
schema, table, column)
  • if (length(_4) 0)
  • if (length(search_config) 0)
  • search_config search_config . " AND
    table_name '" . _4 . "'"
  • else
  • search_config " WHERE table_name '" .
    _4 . "'"
  • if (length(_5) 0)
  • if (length(search_config) 0)
  • search_config search_config . " AND
    column_name '" . _5 . "'"

26
database_search(search text, server, database,
schema, table, column)
  • my sth spi_query("SELECT m.search_id, "
  • ."headline(m.description, q) As headline, "
  • ."rank(m.vector, q) as rank, m.row_id, "
  • ."m.column_name, m.table_name, "
  • ."m.schema_name, m.database_name, "
  • ."m.server_name "
  • ."FROM (SELECT FROM "
  • ."(SELECT FROM tools.search_table
    ".search_config.") AS t "
  • ."LEFT JOIN (SELECT DISTINCT from
    tools.display_user_perms('r', 'r') "
  • ."WHERE rolname SESSION_USER) AS u "
  • ."ON u.relname t.table_name AND u.nspname
    t.schema_name "
  • ."WHERE u.rolname IS NOT NULL) AS m, "
  • ."to_tsquery('_0') AS q "
  • ."WHERE vector _at__at_ q "
  • ."ORDER BY rank(vector, q) DESC")
  • while (defined (row spi_fetchrow(sth)))
  • return_next(

27
display_user_perms(item type, access type)
  • CREATE OR REPLACE FUNCTION "tools"."display_user_p
    erms" (text, text) RETURNS SETOF
    "tools"."user_perms" AS
  • body
  • First Value Passed
  • r ordinary table
  • i index
  • S sequence
  • v view
  • c composite type
  • s special
  • t TOAST table
  • Second Value Passed
  • r -- SELECT ("read")
  • w -- UPDATE ("write")
  • a -- INSERT ("append")
  • d -- DELETE
  • R -- RULE
  • x -- REFERENCES
  • t -- TRIGGER

28
display_user_perms(item type, access type)
  • my who_user spi_fetchrow(who_user_table)
  • my sth spi_query("SELECT current_database()
    AS
  • catname, nc.nspname,
  • c.relname, a.rolname
  • FROM pg_catalog.pg_class AS c
  • LEFT JOIN pg_catalog.pg_authid AS a ON a.oid
    c.relowner
  • LEFT JOIN pg_catalog.pg_namespace AS nc ON
    nc.oid c.relnamespace
  • WHERE c.relkind '_0' AND nc.nspname !
    'pg_'
  • AND c.relacl IS NULL AND (NOT
    pg_is_other_temp_schema(nc.oid))
  • ORDER BY c.relname")
  • while (defined (row spi_fetchrow(sth)))
  • if (length(row-rolname) 0)
  • user login
  • user_rtn who_user-session_user
  • user via login then changed by set
    role
  • user_rtn who_user-current_user

29
display_user_perms(item type, access type)
  • if (length(row3-rolname) 0)
  • user login
  • user_rtn who_user-session_user
  • user via login then changed by set
    role
  • user_rtn who_user-current_user
  • else
  • user_rtn row3-rolname
  • return_next(
  • catname row-catname,
  • nspname row-nspname,
  • relname row-relname,
  • rolname user_rtn
  • )

30
display_user_perms(item type, access type)
  • while (defined (row spi_fetchrow(sth)))
  • postgresarwdxt/postgres,main_ror/post
    gres
  • xxxx -- privileges granted to
    PUBLIC
  • unamexxxx -- privileges granted to a
    user
  • group gnamexxxx -- privileges granted to a
    group
  • Trim first and last characters ""
  • create array at ","
  • _at_users_privs split(/,/,
    substr(row-relacl,1,-1))
  • my user_privlages
  • Loop through all the users
  • foreach user_privlages (_at_users_privs)
  • if (substr(user_privlages,0,5) eq
    "group")

31
display_user_perms(item type, access type)
  • user_rtn user_privs0
  • return_next(
  • catname row-catname,
  • nspname row-nspname,
  • relname row-relname,
  • rolname user_rtn
  • )
  • my sth2 spi_query("SELECT FROM
    tools.lookup_role_users('row-rolname')")
  • while (defined (row2 spi_fetchrow(sth2)))
  • elog(INFO, row2-rolname . "" .
    user_privs1)
  • if (length(row2-rolname) 0)
  • user login
  • user_rtn who_user-session_user
  • user via login then changed by
    set role

32
lookup_role_users (name)
  • CREATE OR REPLACE FUNCTION "tools"."lookup_role_us
    ers" (name) RETURNS SETOF "tools"."lookup_role_use
    rs" AS
  • body
  • my row
  • my row2
  • my sth spi_query("select u.rolname,
    u.rolinherit
  • FROM pg_auth_members LEFT JOIN pg_authid u
  • ON pg_auth_members.member u.oid
  • LEFT JOIN pg_authid r
  • ON pg_auth_members.roleid r.oid
  • WHERE r.rolname '_0'
  • AND u.rolinherit TRUE")
  • while (defined (row spi_fetchrow(sth)))
  • moved to be part of the above query
  • if (row-rolinherit eq 't')
  • return_next(
  • rolname row-rolname

33
foreign_key_details(schema name, table name,
direction)
  • CREATE OR REPLACE FUNCTION "tools"."foreign_key_de
    tails" (text, text, integer) RETURNS SETOF
    "tools"."foreign_key_details_t" AS
  • body
  • my sql
  • my rv2
  • my row2
  • elog(NOTICE,"ROW START")
  • sql "SELECT tc.constraint_name, "
  • ."tc.constraint_type, "
  • ."tc.table_name, "
  • ."kcu.column_name, "
  • ."tc.is_deferrable, "
  • ."tc.initially_deferred, "
  • ."rc.match_option AS match_type, "
  • ."rc.update_rule AS on_update, "
  • ."rc.delete_rule AS on_delete, "
  • ."ccu.table_schema AS
    references_schema, "
  • ."ccu.table_name AS references_table, "
  • ."ccu.column_name AS references_field "

34
foreign_key_details(schema name, table name,
direction)
  • if (_2 0)
  • sql sql . " AND tc.table_name
    '"._1."' "
  • ."AND tc.table_schema '"._0."' "
  • else
  • sql sql . " AND ccu.table_name
    '"._1."' "
  • ."AND ccu.table_schema '"._0."' "
  • rv2 spi_query(sql)
  • while (defined (row2 spi_fetchrow(rv2)))
  • return_next(
  • constraint_name row2-constraint_name,

35
user_perms
  • CREATE TYPE "tools"."user_perms" AS (
  • "catname" NAME,
  • "nspname" NAME,
  • "relname" NAME,
  • "rolname" NAME
  • )

36
lookup_role_users
  • CREATE TYPE "tools"."lookup_role_users" AS (
  • "rolname" NAME
  • )

37
schema_search_t
  • CREATE TYPE "tools"."schema_search_t" AS (
  • "search_id" INTEGER,
  • "headline" TEXT,
  • "rank" REAL,
  • "row_id" TEXT,
  • "column_name" NAME,
  • "table_name" NAME,
  • "schema_name" NAME,
  • "database_name" NAME,
  • "server_name" NAME
  • )

38
foreign_key_details_t
  • CREATE TYPE "tools"."foreign_key_details_t" AS (
  • "constraint_name" NAME,
  • "table_name" NAME,
  • "column_name" NAME,
  • "references_schema" NAME,
  • "references_table" NAME,
  • "references_field" NAME
  • )

39
Future Enhancements
  • Finish the code so that it fully supports
    multi-databases and multi-servers
  • Update the code for the built in TSearch2 in
    PostgreSQL v8.3.x

40
Additional Possible Talks
  • 64 Bit PostgreSQL Windows Drivers
  • How to Create and Install both the 32bit the
    64bit drivers on the same computer and the
    reasons for doing so.
  • Materialized Views
  • Using Triggers for real time materialized views.
  • Using a cron job for materialized snapshots (with
    table safety in case snapshot generation fails).
  • Shadow Tables, backup tables that show all
    changes (what) and by who and when the took
    place.
  • Microsoft Access
  • Using Microsoft Access as a front end for
    PostgreSQL databases.
  • Implementing PosgreSQL user right within your
    Access Database.
  • How to remove the username and/or password from
    Access so that others cant gain access to your
    account.
  • Limitations of Data compared to PostgreSQL and
    things to watch out for when moving data between
    the systems.
  • Microsoft Excel
  • Using PostgreSQL data in Data Tables and Pivot
    Tables.
  • How to remove the username and/or password from
    Excel so that others cant gain access to your
    account.
  • Writing Data back into PostgreSQL.
  • How to fix all your links in Excel to a
    PostgreSQL server that no longer exists and
    redirect them to a new server.
Write a Comment
User Comments (0)
About PowerShow.com