Wednesday, April 20, 2016

Overview of "Profile" and "Folder" in WebCenter Content

In this post, I will go over the basics of two features in WebCenter Content (WCC), Profile and Folder, then talk about their design considerations and usage, and finally discuss their differences.

In WCC, the Profile (a.k.a. metadata profile or content profile, to differentiate it from a user profile) is an approach to metadata modeling. It is a powerful tool for managing metadata fields so that content operations such as check-in, update, and search are handled efficiently.

Essentially, a content profile consists of a set of rules that manage the display of the metadata fields of content items. Through content profiles, you can control which metadata fields are displayed or hidden, required or optional, read-only or editable, initialized with a default value or not, and how they are grouped, all depending on the action being performed on the content: viewing the content information, checking in, updating, or searching. In other words, a profile lets you customize the metadata user interface presented to users. This matters because a profile improves not only the user experience but also data quality and accuracy. If a user is presented with too many irrelevant fields when dealing with a content item, the experience can be daunting and the user is unlikely to provide quality input. It doesn’t take long before the content system fills up with irrelevant data and is, in a sense, abused. Bad profiling also hurts the search experience: if a profile is defined in a way that captures irrelevant or inaccurate data, the overall search results will be misleading.

One of the frustrations I have heard from customers is that users want multiple profiles on a content item. In WCC, only one profile can be associated with a content item. You may find this a drawback for your particular business use cases or the way you process content. However, it is designed this way for a purpose. You can think of the profile as defining the type of a content item: any content item belongs to one type, and not more than one. Allowing multiple profiles to be aggregated on one content item would very likely lead to data abuse and, subsequently, many other undesired outcomes. Another frustration is that “specific resources are required to build or update the profiles”. From what we have discussed, a new profile should be created only when a new type of content item is required in the system. A profile should stay as static as possible and be updated only when the type of content item changes with your business context. If the business context doesn’t require new types of content or the retirement of existing types, profiles should not need to be built or updated frequently. If they do, you may need to revisit the initial design and definition of the profiles so that they match the business needs. Since the profile is the approach to metadata modeling, proper metadata design is essential for daily content management. Metadata in WCC should match your enterprise taxonomy to achieve the best outcome for content organization.

Speaking of the number of profiles, there is no good or bad number; it just needs to fit your business needs. The average number of profiles across US WCC implementations is in the 50-100 range (this statistic is not published anywhere; it comes from a technical summit with the Oracle WCC team). The extreme case I have seen at a client is a WCC system with thousands of profiles, and it works just fine. There is, however, a performance caveat with a very high number of profiles. I encountered such a performance issue in one of my WCC implementations, and it has been addressed by the Oracle team. For details, please check here.

Folder, in WCC, is a way to structure and organize content items. It’s worth noting that in WCC the folders are standalone “virtual” structures: content items are not physically stored in any folder. Instead, every content item in a folder has a metadata field (xCollectionId) that stores a numeric folder ID linking the content item to the folder. It behaves like a symbolic link in the WCC system.
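
To make the “virtual” nature concrete, here is a minimal Java sketch of the idea. It is only an illustration, not how WCC is implemented: the class and method names (ContentItem, FolderIndex, checkIn, list, move) are hypothetical, and only the xCollectionId field name comes from WCC.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: a content item only stores the numeric ID of its folder.
class ContentItem {
    final String name;
    long xCollectionId; // mirrors the metadata field that links an item to a folder

    ContentItem(String name, long xCollectionId) {
        this.name = name;
        this.xCollectionId = xCollectionId;
    }
}

// Hypothetical flat repository: folders exist only as IDs referenced by items.
class FolderIndex {
    private final List<ContentItem> repository = new ArrayList<>();

    void checkIn(ContentItem item) {
        repository.add(item);
    }

    // "Listing a folder" is just a filter on the metadata field.
    List<String> list(long folderId) {
        List<String> names = new ArrayList<>();
        for (ContentItem item : repository) {
            if (item.xCollectionId == folderId) {
                names.add(item.name);
            }
        }
        return names;
    }

    // "Moving" an item rewrites the link; the stored item itself stays where it is.
    void move(ContentItem item, long targetFolderId) {
        item.xCollectionId = targetFolderId;
    }
}
```

In this model, moving an item between folders changes one number on the item and nothing else, which is the sense in which the folder acts like a symbolic link.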

Content folders offer a conventional hierarchical structure that provides easy access to content items in WCC. They are just like the directories on your local laptop, except that they point to virtual locations in the content system. With folders in WCC, you can perform the same actions as in a conventional file system. As the Oracle documentation puts it, “The familiar folder and file model provides a framework for organizing and accessing content stored in the repository. Functionally, folders and files are very similar to those in a conventional file system. You can copy, move, rename, and delete folders and files. You can also create shortcuts to folders or files so you can access a content item from multiple locations in the hierarchy. You can think of the files in the Folders interface as symbolic links or pointers to content items in the repository. The operations you perform in the Folders interface, such as searching or propagating metadata, effectively operate on the associated content items.”

The hierarchical folder interface is provided by a component installed in WCC called FrameworkFolders. It is a scalable enterprise solution intended to replace the earlier Contribution Folder interface (the Folders_g component). For a comparison of FrameworkFolders and Folders_g, you can visit this link for more details.

There are different types of folders that can be used to organize content to fit different needs.
  • Traditional Folders: the general folders we have discussed, which you use to organize your content just like the ones on your computer.
  • Query Folders: folders you create based on a search/query result. A query folder contains the collection of documents matching the search criteria you defined. You can save a query folder just like you create a regular folder.
  • Retention Folders: a type of query folder with retention rules.


Conceptually, Folder and Profile are distinct in their functionality and design purpose. A profile can be considered a way to define a content “type”. A folder, like a conventional folder on your laptop, is a way to aggregate and organize content. You can store content items with the same profile in the same folder or in different folders, and a folder can contain content items with the same profile or with different profiles. An example illustrates the usage better. Say your company has the following types of content items: legal documents, sales documents, and reports. You also have the following departments: HR, IT, and Sales. All departments may have their own legal documents and reports, while sales documents almost always belong to the Sales department rather than the other two. In this case, you may want to use the content types as the profiles and aggregate the content into folders by department. You don’t want to create profiles based on departments, because a department can hold all kinds of content items rather than one static type. If you do define profiles by department, you will find yourself having to create and update profiles all the time.
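
The example above can be sketched in a few lines of Java to show that profile and folder are two independent dimensions of the same item. This is a toy model under my own assumptions: DocProfile, Doc, and DocCatalog are hypothetical names, and nothing here is a WCC API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical content types, standing in for the three profiles in the example.
enum DocProfile { LEGAL_DOCUMENT, SALES_DOCUMENT, REPORT }

// Each item has exactly one profile (its type) and sits in one department folder.
class Doc {
    final String title;
    final DocProfile profile;
    final String folder;

    Doc(String title, DocProfile profile, String folder) {
        this.title = title;
        this.profile = profile;
        this.folder = folder;
    }
}

class DocCatalog {
    private final List<Doc> docs = new ArrayList<>();

    void add(Doc doc) {
        docs.add(doc);
    }

    // All items of one type, regardless of which department folder holds them.
    List<String> byProfile(DocProfile profile) {
        List<String> titles = new ArrayList<>();
        for (Doc doc : docs) {
            if (doc.profile == profile) {
                titles.add(doc.title);
            }
        }
        return titles;
    }

    // All items in one department folder, regardless of their type.
    List<String> byFolder(String folder) {
        List<String> titles = new ArrayList<>();
        for (Doc doc : docs) {
            if (doc.folder.equals(folder)) {
                titles.add(doc.title);
            }
        }
        return titles;
    }
}
```

Adding a new department here only means a new folder name; the set of profiles stays untouched, which is the reason for not modeling departments as profiles.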

Folder and Profile do show some similarities in how content is managed. You can manage content items based on either a folder or a profile. For example, you can search content items by folder or by profile, and you can batch process content items in a folder or a profile (manage workflow, govern security, update content information, etc.). On the other hand, folder and profile differ in many ways by design.
  • Folder, just like a file, can have its own metadata. You can also propagate metadata from a folder to its subfolders and the content items within it. A profile, by contrast, is a way to manage metadata, and you cannot apply metadata on top of a profile.
  • Folder has its own security. Each folder has an owner who can modify its metadata and delete it if needed, but the folder owner doesn’t have any additional privileges over the content items inside the folder. Profile has little to do with security directly. But since a profile manages metadata, it can manage security indirectly: the “Security Group” and “Account” metadata fields can be used to manage the security of a content item.
  • With folders, you can perform basic content retention scheduling by creating a retention query folder, assigning retention attributes to the folder, and then configuring the retention schedule. There is also a specific folder type in WCC, the retention folder, which is based on the query folder with rules for content retention. Since a profile manages metadata, it has little to do with retention directly.
  • In workflow, actions can be applied to content items either within a folder or associated with a profile. From this perspective, folder and profile have similar effects.
  • WCC doesn’t have a standalone tagging service, but you can create custom metadata for this purpose. A folder has its own metadata, so you can apply the custom tag to a folder. A profile, again, as a way to model metadata, can be used to manage any metadata field, including the tag field.

In summary, Profile and Folder are two different concepts in WCC. Although they show some similarities in how content can be managed, their design bases are quite distinct. A profile can be considered a way to define the "type" of a content item and to provide a customizable user interface for users to manage their content. A folder provides a virtual hierarchical structure, just like the conventional file system on your computer, to help organize and manage content. Each should be designed and used according to its essential function to avoid inefficient content management.

Monday, April 11, 2016

Build A Concurrent Web Page With A Simple Example

Today, using a simple example, I am going to demonstrate how to build a concurrent web page in ADF using JDeveloper.

From time to time, I see requirements in various contexts for concurrency on web pages. Performance can benefit greatly, since multiple processes execute at the same time and the response time improves accordingly. Concurrency is also well suited to use cases where the processes are not tied to any user interface interaction, such as sending an email to the user after some action when the web page does not need to know if or when the email is sent. Such processes can run in a thread separate from the main thread that handles page rendering and other user interactions. Even if you don't have a hard requirement for concurrency, why waste the opportunity? Nowadays, servers are built with multi-core processors and designed for parallel processing. In recent years, CPU clock speeds have not increased much, but the number of processing cores has. If you are not taking advantage of them, that capacity goes to waste.

Implementing such concurrency in a Java EE application does not seem complicated, especially since the Java concurrency utilities (java.util.concurrent) have been available since Java 1.5. But in my experience, development teams find such an implementation somewhat formidable. One reason could be that, without careful handling and design, the expected performance gain can turn into a (much worse) degradation. Another could be the lack of examples demonstrating such use cases, which this post aims to make up for.

In this example, I have a page built in ADF with 3 sections: left, central, and right. Each section has its own independent content and requires a different amount of time to process. Let's say the left section takes 2 seconds, the central takes 3 seconds, and the right takes 5 seconds to return its data.

I will start with the default approach (no concurrency), followed by two others that improve the web page response time.

The default

There is no concurrency here. Everything runs sequentially. My page looks like this:

Since everything runs in sequence, the response time is at least 2 (left) + 3 (central) + 5 (right) = 10 seconds. It actually takes 10.59 seconds, which is no surprise.

If you are wondering what the code behind the content rendering looks like, here it is:
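
As a rough stand-in for the sequential approach, a plain Java sketch might look like the following. The class and method names (SequentialPage, load, render) are hypothetical, the slow work is simulated with a sleep, and the delays are passed in as milliseconds so the sketch can be run with small values instead of the 2, 3, and 5 seconds used in the post.

```java
// Hypothetical stand-in for the page's backing bean: each section blocks
// the caller until its content is ready, one after another.
class SequentialPage {
    // Simulates a slow back-end call; the delay is in milliseconds.
    static String load(String section, long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException(e);
        }
        return section + " content";
    }

    // Loads left, central, and right in order; total time is the sum of the delays.
    static String[] render(long leftMs, long centralMs, long rightMs) {
        return new String[] {
            load("left", leftMs),
            load("central", centralMs),
            load("right", rightMs)
        };
    }
}
```

Calling SequentialPage.render(2000, 3000, 5000) would take at least 10 seconds, matching the arithmetic above.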

Now it's time to explore a better solution.

The Better

In this approach, I use ExecutorService to manage the creation and termination of the Java threads that perform the parallel processing. Since there are 3 sections in the page that need to run in parallel, I create a thread pool with 3 threads. Then I submit individual tasks to the executor service. A task is basically the unit of work that needs to be done for each section. Since our tasks need to return values, I use the Callable interface to implement them. Submitting a task returns a Future object, which is a reference to the submitted task. With the Future object, we can get the output of the task when it completes.

CentralContentTask code -

As you can see, I only need to put the task code into the call() method defined by the Callable interface.
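
A task along these lines might look like the following sketch. The class name CentralContentTask comes from the post; the constructor argument and the body are my assumptions, with a sleep standing in for the real data retrieval.

```java
import java.util.concurrent.Callable;

// Sketch of the task for the central section. Everything the section needs
// to compute goes into call(), which the executor runs on a pool thread.
class CentralContentTask implements Callable<String> {
    private final long delayMillis; // assumed parameter, to keep the sketch quick to run

    CentralContentTask(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    @Override
    public String call() throws InterruptedException {
        Thread.sleep(delayMillis); // placeholder for the real data retrieval
        return "central content";
    }
}
```

The left and right sections would get their own tasks of the same shape, differing only in the work done inside call().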

In the managed bean that renders the web page, here is the logic that uses ExecutorService to manage the threads: task submission, termination, output retrieval, etc.

Please note: it is very important to shut down the ExecutorService after task submission. Otherwise, the pool's idle threads stay alive and leak resources. There is no need to wait until the tasks complete before calling shutdown(); tasks that were already submitted still run to completion. You can also shut down the executor service immediately, regardless of task completion status, by using shutdownNow(), but that's not what we are going to pursue.

Afterwards, we get the output of each task by calling get() on its Future object. This method makes the calling thread wait until the task completes and then returns the output.
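
Putting the steps above together, a minimal sketch of the submit / shutdown / get sequence could look like this. ParallelPage, section, and render are hypothetical names, the tasks are simulated with sleeps, and Java 8 lambdas are used for brevity; only ExecutorService, Executors, Callable, and Future are the standard java.util.concurrent API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelPage {
    // Hypothetical helper: a task that sleeps, then returns the section's content.
    static Callable<String> section(String name, long delayMillis) {
        return () -> {
            Thread.sleep(delayMillis); // placeholder for the real data retrieval
            return name + " content";
        };
    }

    // Runs the three sections in parallel and blocks until every result is in.
    static List<String> render(long leftMs, long centralMs, long rightMs) {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        List<Future<String>> futures = new ArrayList<>();
        try {
            futures.add(executor.submit(section("left", leftMs)));
            futures.add(executor.submit(section("central", centralMs)));
            futures.add(executor.submit(section("right", rightMs)));
        } finally {
            // Stop accepting new tasks; the submitted ones still run to completion.
            executor.shutdown();
        }

        List<String> results = new ArrayList<>();
        try {
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until this task has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return results;
    }
}
```

Because the three get() calls wait on tasks that are already running side by side, the overall wait is roughly the longest single delay rather than the sum.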

Let's run the page and see our response time -

It takes 5.37 seconds to load the page. As we are running the 3 processes in parallel, the total time is determined by the one with the longest processing time, which is 5 seconds.

It's much better. But is it enough? Can we push all 3 processes to the background without interfering with the page rendering?

The Ultimate

The idea is to separate the main thread, which renders the web page, from the parallel processing of the 3 sections. It's fairly simple, and we have actually already done most of it in the previous code. The only thing blocking the main thread is the call to Future.get() to retrieve the output of each task, so we need to avoid that in the main thread. The question then becomes: how can we retrieve the output of a task that takes a while and push it back to a web page we have already rendered?

The answer is the ADF poll component. ADF poll can be used in various use cases that require pushing data to a web page after it has been rendered. Here is how to leverage it in our use case.

First, we add an <af:poll> to the web page with an interval of 1 second and make all 3 sections listen to this component.

Second, we implement the refresh() method referenced by the poll component to retrieve the output of the tasks.

Here we use Future.isDone() to check whether a task is complete before getting its output, so that the thread does not wait on a task that is not complete yet.

Another note: after all 3 processes complete, we need to stop the poll event. ADF poll offers a "timeout" property to expire the poll after a period of time, but at this time this feature doesn't work well in ADF 11g. A bug has already been filed, but very likely it will NOT be fixed in ADF 11g. As a workaround, we reset the poll interval to "-1" (a negative value), which automatically disables the poll.

Let's take a look at how long the page takes to render -
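
The non-blocking refresh and the poll shutdown can be sketched in plain Java as follows. This is an approximation under stated assumptions: PolledPage is a hypothetical bean, refresh() takes no ADF poll event so that the sketch runs without ADF on the classpath, and the pollInterval field merely stands in for the interval of the poll component.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical backing bean; refresh() plays the role of the poll listener.
class PolledPage {
    private final Map<String, Future<String>> tasks = new LinkedHashMap<>();
    private final Map<String, String> rendered = new LinkedHashMap<>();
    private int pollInterval = 1000; // stand-in for the poll interval, in ms

    private static Callable<String> section(String name, long delayMillis) {
        return () -> {
            Thread.sleep(delayMillis); // placeholder for the real data retrieval
            return name + " content";
        };
    }

    // Submits the three tasks and returns immediately; nothing here blocks.
    void start(long leftMs, long centralMs, long rightMs) {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        tasks.put("left", executor.submit(section("left", leftMs)));
        tasks.put("central", executor.submit(section("central", centralMs)));
        tasks.put("right", executor.submit(section("right", rightMs)));
        executor.shutdown(); // submitted tasks still run to completion
    }

    // Called on every poll tick: pick up finished results, skip unfinished ones.
    void refresh() {
        for (Map.Entry<String, Future<String>> entry : tasks.entrySet()) {
            Future<String> future = entry.getValue();
            if (future.isDone() && !rendered.containsKey(entry.getKey())) {
                try {
                    rendered.put(entry.getKey(), future.get()); // no wait: task is done
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
        }
        if (!tasks.isEmpty() && rendered.size() == tasks.size()) {
            pollInterval = -1; // mirrors the workaround: a negative interval stops polling
        }
    }

    // Returns the section's content, or null while it is still being computed.
    String content(String section) {
        return rendered.get(section);
    }

    int getPollInterval() {
        return pollInterval;
    }
}
```

In the real page, the bean would set the negative interval on the poll component itself; the field here only shows where that step belongs in the flow.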

Yes, 407ms!

Now all 3 processes run behind the page, and when each one is ready, its result is pushed to the page individually. If you watch the page, you will see the left content come first, then the central content, and finally the right content.

Most of the time, user perception is everything. With the last approach, we can render the page for the user almost instantly and gradually add the content as it becomes ready. That is the ultimate scenario. In cases where the user doesn't want to start with empty content, we can put the processing of the desired content back on the main thread with the page rendering, so that the web page always shows that content at first sight. In that case, the total response time will be a few hundred milliseconds plus however long the desired content takes to process.

The sample application demoed in this post can be downloaded here. It's built in JDeveloper