Let's say you're a developer making a package for OpenACS. You've heard statements like, "every package should use the content repository", or maybe a developer has suggested that you use it. Or maybe you just stumbled across it. Why would you want to spend your time reading this document and wasting a good afternoon when you could get started coding right away?
The simple answer is that the content repository (CR) gives you many different things for free:
For the sake of an example, I'm going to use a Tasks application. This application will keep track of all the tasks of an organization, deadlines for those tasks, and who needs to work on them.
The reason I might be interested in using the content repository (CR) in this case is that I can keep track of all the changes to a particular Task, so that if someone changes the deadline on an item, I can see what the original deadline was. In addition, I as a developer would like to have sub-tasks, and sub-sub-tasks, so I can have a whole hierarchy of things that need to be accomplished. Big tasks can be sub-divided so that several people can each do their particular parts.
So I decide to create a Tasks table. Each of these Tasks has various information associated with it, such as deadlines, descriptions of what needs to be accomplished, and so on:
          Task
          Title
          Description
          Task Number
        First of all, let's get some terminology out of the way. Columns of a table are referred to as attributes in content repository-speak.
The steps to set up your data model are as follows:The first step is to decide on what part of a Task you'd you'd like to have under revision control, and what portion you'd like to have just one version of. In our case, the only thing we wouldn't want under version control is the Task Number. This will be a unique identifier for the task, and we don't want that changing every time someone edits it.
For our simple example:
          Title - want versions
          Description - want versions
          Task Number - do NOT want versions
        You will have two tables: one with versioned attributes, and one without versioned attributes.
Convention: often, developers will name the first table by what it is (in my case pm_tasks), and the second, versioned table by the same name, but with _revisions at the end. Thus, I'll name my second table pm_tasks_revisions.
This is actually very easy:Versioned portion:
            create table pm_tasks_revisions (
            task_revision_id
            integer 
            constraint pm_tasks_revisions_id_pk
            primary key
            constraint pm_tasks_revisions_id_fk
            references cr_revisions(revision_id)
            on delete cascade,
            title
            varchar(100),
            description
            varchar(4000)
            );
          Unversioned portion:
            create table pm_tasks (
            task_id
            integer
            constraint pm_tasks_id_pk
            primary key
            constraint pm_tasks_id_fk
            references cr_items(item_id)
            on delete cascade,
            task_number
            integer
            )
          One thing you have to be careful of when creating these
      tables is that there are no columns that have the same names as
      any of the columns in the cr_items and
      cr_revisions tables. For example, you can't call
      you key on the pm_tasks_revisions table
      revision_id. Why? There are some views that are
      automatically generated that combine these tables for you, but
      they won't be created if the names conflict. I'll describe what
      these views are later, but they are useful. You were warned.
Notice that each table uses as its primary key a reference to
      either the cr_revisions table or the
      cr_items table. A content item is basically
      just some content: either text or binary data. The contents
      revisions table keeps track of which version from the
      tasks_revisions table is the most current, and which one is
      live. 
All this is going inside the
      sql/postgresql/project-manager-create.sql
      file. Your name will be different of course.
After we've created the two tables, we need to let the content repository know that we have a new type of structured data that we are storing in the content repository. Tasks are a "content type", because they have data associated with them, such as when they are due, and what needs to be done.
I thus need to to
          --create the content type
          select content_type__create_type (
          'pm_task', -- content_type   
          'content_revision', -- not sure what this is
          'Task', -- pretty_name
          'Tasks', -- pretty_plural
          'pm_tasks_revisions', -- table name
          'task_id', -- id_column
          'content_revision.revision_name'
          );
        You then need to add in all the attributes, so that the content repository can do some magic things behind the scenes. The content repository doesn't know about what's inside of the pm_tasks and pm_tasks_revisions tables, so we teach it:
          -- add in attributes
          
          select content_type__create_attribute (
          'pm_task', -- content_type
          'start_date', -- attribute_name
          'date',     -- datatype (string, number, boolean, date, keyword, integer)
          'Start date', -- pretty_name
          'Start dates', -- pretty_plural
          null, -- sort_order
          null, -- default value
          'timestamptz' -- column_spec
          );
          
          select content_type__create_attribute (
          'pm_task', -- content_type
          'end_date', -- attribute_name
          'date',     -- datatype
          'End date', -- pretty_name
          'End dates', -- pretty_plural
          null, -- sort_order
          null, -- default value
          'timestamptz' -- column_spec
          );
          
          select content_type__create_attribute (
          'pm_task', -- content_type
          'percent_complete', -- attribute_name
          'number',           -- datatype
          'Percent complete', -- pretty_name
          'Percents complete', -- pretty_plural
          null, -- sort_order
          null, -- default value
          'numeric' -- column_spec
          );
        
   
     Side effect: once you've created the content type, the
     content repository creates a view for you called
     pm_tasks_revisionsx. Note the x at the end of the
     name. If you're using Postgres, I believe it will also create a
     view for you called pm_tasks_revisionsi
Why are these two views created? the x view is created for selection, and the i view is created for inserts. They join the acs_objects, cr_revisions, and our pm_tasks_revisions tables together. Try viewing them to get an idea of how they might be useful.
It is also possible to dynamically create tables, and extend
      them with extra columns. You could do this by using create
      table or alter table add column statements
      in SQL, but this also adds in some meta-data that will be useful
      to you. The disadvantage is that you have to call the content
      repository API. The advantage is that someday you'll be able to
      do really cool stuff with it, like automatically generate
      interfaces that take advantage of the new columns and tables
      you've added. Another nice thing is that all that messy business
      of defining your attributes through the API is taken care of.
Types is the content repository are another term for
      tables, although that doesn't explain it completely. Types are
      also kept track of within OpenACS, in the
      acs_object_types table, so the system knows about
      the tables you create, and can do some intelligent things with
      them. 
A lot of the intelligent things you can do with this information is still being built. But imagine for example that you are using the project manager package I've written. You work at an ice cream company, and every task that is done also has an associated ice cream flavor with it (yeah, this isn't a good example, but pay attention anyway). If I've written the project manager to take advantage of it, when you add in this extra attribute to the pm_tasks_revisions table, the UI aspects will be automatically taken care of. You'll be able to select a flavor when you edit a task, and it will be shown on the task view page. This is the direction OpenACS development is going, and it will be really really cool!
First, I'm going to describe how to extend other content repository tables using the CR API. Then, I'll describe how to set up your own tables as well:
As you recall from earlier in this page, attributes are just another term for columns in a table. The Content Repository has a mechanism for adding and removing columns via the pl/sql API. If you check your /api-doc: /api-doc/plsql-subprogram-one?type=FUNCTION&name=content%5ftype%5f%5fcreate%5fattribute
, you'll see that there is a way to extend the columns programmatically.
        
Why would you want to do this? For project manager, I decided to do this because I wanted to customize my local version of the projects table, to account for company-specific information. That way, I can have a separate edit page for those types, but not have a separate table to join against.
. Instead of doing this:
alter table pm_projects add column 
        target_date  date;
select content_type__create_attribute(
        'pm_project',
        'target_date',
        'date',
        'Target date',
        'Target dates',
        null,
        null,
        'date'
);
select content_type__create_attribute(
        'pm_project',
        'company_id',
        'integer',
        'Company',
        'Companies',
        null,
        null,
        'integer constraint pm_project_comp_fk references organizations'
);pm_task__new_task_item function (which we'll write in a little bit). This function creates both a new content item, and a new content revision. Information is actually stored in four tables, believe it or not: cr_revisions, cr_items, pm_tasks, and pm_tasks_revisions.
The task number is stored in pm_tasks, the title and description are stored in pm_tasks_revisions, and some additional information like who entered the information is stored in cr_revisions and cr_items.
Whenever you make a change to this item, you don't change the table yourself, but add a revision, using your pm_task__new_task_revision function (which we'll write in a little bit). This function adds another revision, but not another item or cr_item. After you've added another revision, you'll have two revisions and one item. Two entries in cr_revisions (and pm_tasks_revisions), and one item in cr_items and pm_tasks. 
The cr_revisions table keeps track of which item is the most recent, and which item is "live". For the edit-this-page application, for example, this is used to keep track of which revision to a page is actually being served to users. 
In your code, you'll use your pm_tasks_revisionsx view, which joins the pm_tasks_revisions table with the cr_revisions table (and it might even join in cr_items -- I forget at the moment). 
select define_function_args('pm_task__new_task_item', 'task_id, project_id, title, description, end_date, percent_complete, estimated_hours_work, estimated_hours_work_min, estimated_hours_work_max, creation_date, creation_user, creation_ip, package_id');
create function pm_task__new_task_item (
        integer,        -- task_id
        integer,        -- project_id
        varchar,        -- title
        varchar,        -- description
        timestamptz,    -- end_date
        numeric,        -- percent_complete
        numeric,        -- estimated_hours_work
        numeric,        -- estimated_hours_work_min
        numeric,        -- estimated_hours_work_max,
        timestamptz,    -- creation_date
        integer,        -- creation_user
        varchar,        -- creation_ip
        integer         -- package_id
) returns integer 
as '
declare
        p_task_id                               alias for $1;
        p_project_id                            alias for $2;
        p_title                                 alias for $3;
        p_description                           alias for $4;
        p_end_date                              alias for $5;
        p_percent_complete                      alias for $6;
        p_estimated_hours_work                  alias for $7;
        p_estimated_hours_work_min              alias for $8;
        p_estimated_hours_work_max              alias for $9;
        p_creation_date                         alias for $10;
        p_creation_user                         alias for $11;
        p_creation_ip                           alias for $12;
        p_package_id                            alias for $13;
        v_item_id               cr_items.item_id%TYPE;
        v_revision_id           cr_revisions.revision_id%TYPE;
        v_id                    cr_items.item_id%TYPE;
        v_task_number           integer;
begin
        select acs_object_id_seq.nextval into v_id from dual;
        -- We want to put the task under the project item
        -- create the task_number
        
        v_item_id := content_item__new (
                v_id::varchar,          -- name
                p_project_id,           -- parent_id
                v_id,                   -- item_id
                null,                   -- locale
                now(),                  -- creation_date
                p_creation_user,        -- creation_user
                p_package_id,           -- context_id
                p_creation_ip,          -- creation_ip
                ''content_item'',       -- item_subtype
                ''pm_task'',            -- content_type
                p_title,                -- title
                p_description,          -- description
                ''text/plain'',         -- mime_type
                null,                   -- nls_language
                null                    -- data
        );
        v_revision_id := content_revision__new (
                p_title,                -- title
                p_description,          -- description
                now(),                  -- publish_date
                ''text/plain'',         -- mime_type
                NULL,                   -- nls_language
                NULL,                   -- data
                v_item_id,              -- item_id
                NULL,                   -- revision_id
                now(),                  -- creation_date
                p_creation_user,        -- creation_user
                p_creation_ip           -- creation_ip
        );
        PERFORM content_item__set_live_revision (v_revision_id);
        insert into pm_tasks (
                task_id, task_number)
        values (
                v_item_id, v_task_number);
        insert into pm_tasks_revisions (
                task_revision_id, end_date, percent_complete, estimated_hours_work, estimated_hours_work_min, estimated_hours_work_max, actual_hours_worked)
        values (
                v_revision_id, p_end_date, p_percent_complete, p_estimated_hours_work, p_estimated_hours_work_min, p_estimated_hours_work_max, ''0'');
        PERFORM acs_permission__grant_permission(
                v_revision_id,
                p_creation_user,
                ''admin''
        );
        return v_revision_id;
end;' language 'plpgsql';
select define_function_args('pm_task__new_task_revision', 'task_id, project_id, title, description, end_date, percent_complete, estimated_hours_work, estimated_hours_work_min, estimated_hours_work_max, actual_hours_worked, creation_date, creation_user, creation_ip, package_id');
create function pm_task__new_task_revision (
        integer,        -- task_id (the item_id)
        integer,        -- project_id
        varchar,        -- title
        varchar,        -- description
        timestamptz,    -- end_date
        numeric,        -- percent_complete
        numeric,        -- estimated_hours_work
        numeric,        -- estimated_hours_work_min
        numeric,        -- estimated_hours_work_max
        numeric,        -- actual_hours_worked
        timestamptz,    -- creation_date
        integer,        -- creation_user
        varchar,        -- creation_ip
        integer         -- package_id
) returns integer 
as '
declare
        p_task_id                               alias for $1;
        p_project_id                            alias for $2;
        p_title                                 alias for $3;
        p_description                           alias for $4;
        p_end_date                              alias for $5;
        p_percent_complete                      alias for $6;
        p_estimated_hours_work                  alias for $7;
        p_estimated_hours_work_min              alias for $8;
        p_estimated_hours_work_max              alias for $9;
        p_actual_hours_worked                   alias for $10;
        p_creation_date                         alias for $11;
        p_creation_user                         alias for $12;
        p_creation_ip                           alias for $13;
        p_package_id                            alias for $14;
        v_revision_id           cr_revisions.revision_id%TYPE;
        v_id                    cr_items.item_id%TYPE;
begin
        select acs_object_id_seq.nextval into v_id from dual;
        -- We want to put the task under the project item
        v_revision_id := content_revision__new (
                p_title,                -- title
                p_description,          -- description
                now(),                  -- publish_date
                ''text/plain'',         -- mime_type
                NULL,                   -- nls_language
                NULL,                   -- data
                p_task_id,              -- item_id
                NULL,                   -- revision_id
                now(),                  -- creation_date
                p_creation_user,        -- creation_user
                p_creation_ip           -- creation_ip
        );
        PERFORM content_item__set_live_revision (v_revision_id);
        insert into pm_tasks_revisions (
                task_revision_id, end_date, percent_complete, estimated_hours_work, estimated_hours_work_min, estimated_hours_work_max, actual_hours_worked)
        values (
                v_revision_id, p_end_date, p_percent_complete, p_estimated_hours_work, p_estimated_hours_work_min, p_estimated_hours_work_max, p_actual_hours_worked);
        PERFORM acs_permission__grant_permission(
                v_revision_id,
                p_creation_user,
                ''admin''
        );
        return v_revision_id;
end;' language 'plpgsql';
-- The delete function deletes a record and all related overhead. 
select define_function_args('pm_task__delete_task_item', 'task_id');
create or replace function pm_task__delete_task_item (integer)
returns integer as '
declare
        p_task_id                               alias for $1;
begin
        delete from pm_tasks_revisions
                where task_revision_id in (select revision_id from pm_tasks_revisionsx where item_id = p_task_id);
        delete from pm_tasks
                where task_id = p_task_id;
        raise NOTICE ''Deleting pm_task...'';
        PERFORM content_item__delete(p_task_id);
        return 0;
end;' language 'plpgsql';
item_id - unique id for this item, will be different than the revision_idcr_revisions:
parent_id - used to group items into a hierarchy (see below)
name - this is used to make a URL by the content repository. It must be unique per content folder. You can use a number, or something like project_231. One way to do this is to set it equal to a title plus the item_id.
locale - not sure, probably for internationalization support
live_revision - this is equal to the cr_revision table's revision_id that is the live version
latest_revision - this is equal to the cr_revision table's revision_id that is the latest version
publish_status - not sure
content_type - not sure
storage_type - not sure, probably text or binary?
storage_area_key - not sure
tree_sortkey - a utility column used in hierarchical queries.
revision_id - a unique id for this revision.
item_id - a reference to the item_id for this revision
title - you can use this for your application. For example, My Big Project
description - you can use this for your application, as a longer description.
publish_date - the date this was published. Not sure if this is for your use, or internal
mime_type - the mime type.
nls_language - I believe this is for internationalization
lob - the binary content.
content - the text content.
content_length - the length of the text or binary content?
Using this structure is optional, but useful in many circumstances.
The facility for this is built into the cr_items data model. This makes sense, because you wouldn't want your hierarchy associated with each revision. Here's how Postgres describes the cr_items table:
                         Table "public.cr_items"
      Column      |          Type          |          Modifiers          
------------------+------------------------+-----------------------------
 item_id          | integer                | not null
 parent_id        | integer                | not null
 name             | character varying(400) | not null
 locale           | character varying(4)   | 
 live_revision    | integer                | 
 latest_revision  | integer                | 
 publish_status   | character varying(40)  | 
 content_type     | character varying(100) | 
 storage_type     | character varying(10)  | not null default 'text'
 storage_area_key | character varying(100) | not null default 'CR_FILES'
 tree_sortkey     | bit varying            | 
parent_id refers to either a content item (cr_items), or a subclass of a content_item (such as cr_folders). I'll explain more later about cr_folders.
One thing that you might want to do for your application is to give the application its own root directory. Because the content repository is shared among applications, this separates it off from other applications. They can still use the items in your application, but it must be a more deliberate process. If you don't create your own root directory, you may see strange-looking data from other applications in your application, or see your application's data in other applications. There are times when you'll want to do this, but probably not until you're much more familiar with the content repository. Another reason for creating your own root repository is that you application may be mounted several times. If you want to separate the directory structure between instances of your application, you need to create your own root directory:
-- Creates and returns a unique name for new project folders
select define_function_args('pm_project__new_unique_name', 'package_id');
create function pm_project__new_unique_name (integer)
returns text as '
declare
	p_package_id		alias for $1;
	v_name			cr_items.name%TYPE;
	v_package_key		apm_packages.package_key%TYPE;
	v_id			integer;
begin
	select package_key into v_package_key from apm_packages
	    where package_id = p_package_id;
	select acs_object_id_seq.nextval into v_id from dual;
	-- Set the name
	select v_package_key || ''_'' || 
	    to_char(current_timestamp, ''YYYYMMDD'') || ''_'' ||
	    v_id into v_name;
	return v_name;
end;' language 'plpgsql';
select define_function_args('pm_project__new_root_folder', 'package_id');
create function pm_project__new_root_folder (integer)
returns integer as '
declare
	p_package_id		alias for $1;
        v_folder_id                cr_folders.folder_id%TYPE;
	v_folder_name		cr_items.name%TYPE;
begin
	-- Set the folder name
	v_folder_name := pm_project__new_unique_name (p_package_id);
	v_folder_id := content_folder__new (
	    v_folder_name,			-- name
	    ''Projects'',			-- label
	    ''Project Repository'',		-- description
	    p_package_id				-- parent_id
	);
	-- Register the standard content types
	PERFORM content_folder__register_content_type (
	    v_folder_id,	 -- folder_id
	    ''pm_project'',      -- content_type
	    ''f''		 -- include_subtypes
	);
	-- TODO: Handle Permissions here for this folder.
	return v_folder_id;
end;' language 'plpgsql';
Typically, this definition would go in your sql/postgresql/project-manager-create.sql file. If this file is broken in several parts, this would go in the project-manager-create-functions.sql portion.
Once you've created your root directory, you will set the parent_id of your items to the id for the new root repository (in our case, it's returned from the pm_project__new_root_folder  function)
In the project-manager application, we'll create a root repository, and make all projects under that root repository. That means they'll all have a parent_id set to the root repository. However, we also want to make projects that are sub-projects of other projects. In that case, we will set the parent_id of the sub-project to the item_id of the parent. 
cr_items, and the only real difference is that they contain no data, except for a label and description.
If you create folders for your application, then you'll need to make sure you manage them along with your other objects. For example, if you were to add a folder for each of your objects, then you would probably want to make sure you delete the folder when you delete the object.
However, in many cases you are not creating more than one folder. In fact, the only folder you might have will be the root folder you create for each instance of your application (if you install the project-manager in two parts of your web server, for example, it should have two different root folders). When your application is running, it can determine the root folder by searching the cr_folders table. Here's the definition of that table:
                 Table "public.cr_folders"
       Column       |          Type           |  Modifiers  
--------------------+-------------------------+-------------
 folder_id          | integer                 | not null
 label              | character varying(1000) | 
 description        | text                    | 
 has_child_folders  | boolean                 | default 'f'
 has_child_symlinks | boolean                 | default 'f'
 package_id         | integer                 | 
package_id column. The nice thing about this column is that you can use it to find the root repository, if you only have one folder per instance of your application. You can get your package_id using this call within your .tcl file:
Then you can find the root repository by using a query like this:set package_id [ad_conn package_id]
select folder_id from cr_folders where package_id = :package_id;
| Get latest revision (Tcl) | set live_revision_id [db_exec_plsql get_live_revision "select content_item__get_live_revision(:item_id)"] | 
|---|---|
| Get latest revision (pl/sql) | live_revision_id := content_item__get_live_revision(:item_id); | 
The item_id identifies the content item with which the revision is associated.
Likewise, the most recent revision of a content item can be obtained with the content_item__get_latest_revision function
cr_revisions table, and a table that you set up to store specialized data. In this case, the title (I think), byline, and publication date would be stored in a specialized table.The problem seems to be that there were still items in theReferential Integrity: attempting to delete live_revision: 658
cr_items table. You can remove them using select content_item__delete(648); in psql. You get the codes by doing a query like this: 
Really, however, what you need to do is make sure your __delete and drop scripts first go through and delete all children of those items. I'm not sure if you need to delete the items themselves -- I believe they may be dropped by themselves when the tables are dropped, because of theselect i.item_id, r.revision_id, r.title, i.content_type from cr_items i, cr_revisions r where i.item_id = r.item_id order by i.item_id, r.revision_id;
cascade portion of the SQL data model.
When I was troubleshooting folders, I found this query useful:
select f.folder_id,f.label,f.description,i.content_type from cr_folders f, cr_items i where f.folder_id = i.item_id;
Once again, thanks to daveb for help in tracking this down (he rocks!).