Thread: Need best practices for very large models


Replies: 2 - Last Post: Dec 5, 2014 5:41 AM by Thom Pantazi
Patrick Demets

Posts: 12
Registered: 5/22/14
Need best practices for very large models  
  Posted: Jun 20, 2014 5:42 PM
Hello,

Apologies for the long-winded explanation. Skip to the questions below if you need to.

As a long-time user of Oracle Designer (RIP), I've struggled for several years to find an adequate replacement. I haven't found a suitable product, and believe me, I've tried all the modern contenders at clients in real-world situations (not a 10-day trial with locked features).

Be that as it may, my current client is using ER/Studio DA XE5 with a recently revived and upgraded repository. Our team has recently grown from one Data Architect to five, and it would be appropriate and efficient to work collaboratively on common models, rather than sneaker-netting separate files of nearly-but-not-quite-the-same data models.

Enough with the boring stuff ... here's the situation:

Our enterprise data model is on the large side of huge (3000-ish entities). Our application systems consolidation has uncovered numerous databases with tables that can be reverse-engineered to common entities, but this still means that for a given entity several tables have been implemented in slightly different flavors. This would result in about 6000 tables. For example, we may have only one Vendor entity with a common set of attributes, but several VENDOR tables with variations in table name, columns, keys, etc.

Naturally, a great many entities in the EDM need to refer to other entities, not always in the same Subject Area. For instance, Purchase Order has relationships to Vendor, Product, and Customer, all in different Subject Areas. What I've seen done in ER/Studio is to have smaller data model files for each SA, except that in that case I cannot have a relationship from an entity in one model file to an entity in another model file; I need to create copies of entities, and that to me flies in the face of Enterprise Modeling and the minimization of redundancy. As data modelers and architects, aren't we the first ones to say "single source of truth" and "every item stored in its proper place" (i.e., normalization)?

The related issue is that each object in the physical data model needs to be able to trace back (reference) to its mapped entity (or entities). Technically, the PDMs do not all need to be together, but each PDM must be in the same file as the LDM. Ergo ... all the PDMs end up in the same file as the LDM.

So ... my questions are (finally, you say, he gets to them):

1) Can ER/Studio handle huge model files (remember, we're talking 3000 entities and 6000 tables, obviously with a commensurate quantity of attributes, relationships, etc.)? What are some of the gotchas?

2) Can the repository likewise handle such large model files, and serve them up reasonably quickly, and still be workable when comparing and merging?

I'd appreciate any useful, constructive suggestions and recommendations from experienced ER/Studio users, especially from those who have dealt with a similar situation. Any best practices on collaborating within a medium-sized team would also be appreciated.

Thanks,

Patrick
dean siewert

Posts: 21
Registered: 9/24/10
Re: Need best practices for very large models  
  Posted: Sep 16, 2014 2:52 PM   in response to: Patrick Demets
Patrick Demets wrote:
<snip: original post quoted in full above>

I haven't gone quite as large as your environment, but it's certainly true that ER/Studio will slow down as it tries to handle more and more data. I believe it can handle the size you are talking about. What I've seen is that as the model gets large, more time gets consumed updating the internal data structures and the external presentation. If you are making the changes manually, you will probably be slow enough that the computer and software keep up with you. If you are making changes programmatically, take a look at the DiagramManager.EnableScreenUpdateEx() function. It allows you to suspend updates to your windows until you enable screen updates again. I recently had a script that ran for over 8 hours; once I turned off the screen updates, it completed in under 10 minutes.
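
For illustration, a minimal sketch of that pattern as an ER/Studio Sax Basic macro. EnableScreenUpdateEx is the call named above, but its exact signature (assumed here to take a single Boolean) and the ActiveDiagram/Entities iteration should be verified against the macro reference for your version before use:

    Sub Main
        Dim diag As Diagram
        Dim mdl As Model
        Dim ent As Entity

        ' Suspend window redraws before making bulk changes.
        ' Assumption: the method takes a single Boolean; check the
        ' automation reference for the exact signature.
        DiagramManager.EnableScreenUpdateEx False

        Set diag = DiagramManager.ActiveDiagram
        Set mdl = diag.ActiveModel

        ' Make the bulk changes while redraws are suspended.
        For Each ent In mdl.Entities
            ' ... per-entity change goes here ...
        Next ent

        ' Re-enable updates so the diagram redraws once at the end.
        DiagramManager.EnableScreenUpdateEx True
    End Sub

The point is that without the suspension, every change in the loop can trigger a redraw; suspending defers all of that to a single refresh at the end, consistent with the 8-hours-to-10-minutes improvement described above. One caveat: if the macro aborts partway through, updates may stay suspended, so re-enabling them in an error handler is prudent.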

The question about the repository depends a lot on your PC and your repository server. A fast server that's physically close will be best. I'm in a remote office, about 100 miles from the repository server; it might take me 3 minutes to save to the repository, while people in the home office can do the same in 20 seconds or less.

A lot of your success will depend on finding a good way to group your data. It's possible to put everything into a 60 MB model, but, as you suggest, operations like a merge with that much data will not be fast. If you have subject areas that do not overlap, splitting them into multiple diagrams will help overall performance. Along the same lines, if you and your team can get in the habit of checking out only the submodel in question rather than the entire diagram, that should speed up integrating changes back into the repository.
Thom Pantazi

Posts: 3
Registered: 12/1/08
Re: Need best practices for very large models  
  Posted: Dec 5, 2014 5:41 AM   in response to: Patrick Demets
Patrick,

I do not believe you can expect ER/Studio to handle the volume you are talking about. We have a system with 11,000 entities, and I have never been able to reverse engineer the database; the process dies somewhere around 1,500 entities when ER/Studio runs out of memory. My machine clearly has enough memory, because the process never comes close to consuming its 16 GB of RAM, so the limit appears to be in the application rather than the hardware. Having said that, I do manage to use ER/Studio, and I do so the same way you eat an elephant: one bite at a time. I take on segments as models. Not all the data in a large database is connected; in fact, the common story is that there are silos of connected data. Those silos usually have points where they connect to other silos, but you can usually manage the process with a bit of domain management.

Thom