bug found while using caching and its log

A place for users and developers of the Xataface to discuss and receive support.

Re: bug found while using caching and its log

Postby shannah » Mon Jan 11, 2010 11:59 am

Do you mean is doesn't look like it is working on windows? Are there entries in the dataface__mtimes table or is it empty.. If non-empty, what do the entries look like. It should contain table names with unix timestamps.
shannah
 
Posts: 4457
Joined: Wed Dec 31, 1969 5:00 pm

Re: bug found while using caching and its log

Postby kevinwen » Tue Jan 12, 2010 11:48 am

Steve,

I found a bug when inserting a new record with cache enabled. Here's how I produce the bug:
1. Delete all caches, if any.
2. Turn on cache.
3. Remember the total number in a table, e.g. 'Found 1190 records' in attachment.
4. Use 'new record' in the attachment to create a new record.
5. The 'Found 1190 records' remains same, but the record just created shows 1191 (process_id) as the primary key.

I think the problem is: the cache is updated only when a select statement on the table is examined, instead of updated on new record inserted. The solution I mentioned above will solve this problem.

Another big problem in your current solution is the cost. You can imagine that every select statement has to introduce another sql statement -- show table status -- as the price of using cache. Each query to database has to be authenticated and checked against privileges, which would take some time for huge volume of select statement. My solution will solve this problem as well. The workaround is to execute a simple query on each table (select 1 from table1, select 1 from table2, etc.) to force the cache to be updated.

There are 2 issuees I'm really concerned about against performance:
1. The use of views in xataface. I saw that there are a lot of views like that: dataface__view_processes_934276e660154d9f61793761035c273a. It would slow down the performance as well, since a result set for the view has to be generated first in order to get another result set from this result set. If cache is disabled or in some situation is not applicable using cache, querying on the view takes more time. Could we create a Materialized View (a term from Oracle and is actually a table storing data) to store result set for those views and then create a database TRIGGER to update these tables?


2. In the drop down list in the attachment, each number contributes a query. Could we use a aggregation table (like Materialized View) to store those number? In my applicaiton, there are 29 queries executed with cache disabled only for the 'list all' of the processes table (This table has 10 filters).

Let me know your thought.

Kevin
Attachments
sample.JPG
sample.JPG (70.38 KiB) Viewed 1936 times
kevinwen
 
Posts: 109
Joined: Mon Dec 28, 2009 3:44 pm

Re: bug found while using caching and its log

Postby shannah » Tue Jan 12, 2010 12:28 pm

I think the problem is: the cache is updated only when a select statement on the table is examined, instead of updated on new record inserted. The solution I mentioned above will solve this problem.

I'll look into this bug. Since the row count is produced by a select query, the cache should be refreshed also. What happens if you hit browser "refresh" on this page. Does the count correct itself? (i wonder if there could be a difference between the mysql time and the server time?)

Another big problem in your current solution is the cost. You can imagine that every select statement has to introduce another sql statement -- show table status -- as the price of using cache. Each query to database has to be authenticated and checked against privileges, which would take some time for huge volume of select statement. My solution will solve this problem as well. The workaround is to execute a simple query on each table (select 1 from table1, select 1 from table2, etc.) to force the cache to be updated.

The cost is an issue either way. The difference is that refreshing the cache at query time allows us to amortize the cost over all of the queries, and the maximum cost is O(1) per query (either it refreshes the cache for that single query or it doesn't).

If we decide to clear the cache on update, we face a large buildup of queries whose cache needs to be cleared. On a busy site it is not out of the question that we would need to delete thousands or hundreds of thousands of cached queries all in one shot.

1. The use of views in xataface. I saw that there are a lot of views like that: dataface__view_processes_934276e660154d9f61793761035c273a. It would slow down the performance as well, since a result set for the view has to be generated first in order to get another result set from this result set. If cache is disabled or in some situation is not applicable using cache, querying on the view takes more time. Could we create a Materialized View (a term from Oracle and is actually a table storing data) to store result set for those views and then create a database TRIGGER to update these tables?


Dataface uses views for something different than the query cache. These particular views are used to cache result sets that you specify in the __sql__ directive of a fields.ini file (i.e. you are overriding the select query for a table). Prior to using a view for this, I used subqueries. The performance between the two strategies is not even close - using a view is much, much faster. (it allows MySQL to use its own internal caching and optimization).

Using a table instead of a view to store these results would create a relatively complex system that would be difficult to debug, as it would create more places where we need to worry about clearing a cache. Using a view, we never have to clear the cache when the database is changed - only when the __sql__ directive's query is changed. We let mysql take care of caching and optimization for this query.

2. In the drop down list in the attachment, each number contributes a query. Could we use a aggregation table (like Materialized View) to store those number? In my applicaiton, there are 29 queries executed with cache disabled only for the 'list all' of the processes table (This table has 10 filters).


Actually each number doesn't contribute a query. There is only one query per filter. The query uses a group by clause to generate these results. Of course as we get down to the nitty-gritty, we always want to ask "is there a way I can improve performance". I fear that the added maintenance of a materialized view would outweigh the performance gain. These filters already benefit from query caching (when enabled) anyways.... and MySQl is uncanny for its ability to use its own internal caching and clever algorithms to churn out these types of queries quickly.

Thank you for taking the time to dissect the innards of Xataface. It can definitely benefit from more eyes.
shannah
 
Posts: 4457
Joined: Wed Dec 31, 1969 5:00 pm

Re: bug found while using caching and its log

Postby shannah » Tue Jan 12, 2010 12:31 pm

On a side note, my personal experience with the query caching has been mixed. On databases with very complex __sql__ directives in the fields.ini file, or very complex queries with many joins and subqueries - the query caching offers a noticable performance gain.

Otherwise I have found that the performance gains are not tremendously noticeable.

For sites where there are few updates, but lots of traffic, the output cache is a must to use. It makes a huge difference to server load and site performance. (of course query caching probably helps out server load for busy sites also).
shannah
 
Posts: 4457
Joined: Wed Dec 31, 1969 5:00 pm

Re: bug found while using caching and its log

Postby shannah » Tue Jan 12, 2010 12:56 pm

You can imagine that every select statement has to introduce another sql statement -- show table status -- as the price of using cache. ....
Each query to database has to be authenticated and checked against privileges, which would take some time for huge volume of select statement.


Missed this in my first pass. Actually, show table status is called once per request - so even if there are hundreds of SQL queries in a given request, show status is only called once. (Although it is marked to be recalled if any update or insert statements are executed).

Also, I'm not sure what you mean by "Each query to database has to be authenticated and checked against privileges"?
shannah
 
Posts: 4457
Joined: Wed Dec 31, 1969 5:00 pm

Re: bug found while using caching and its log

Postby kevinwen » Tue Jan 12, 2010 1:21 pm

Thank you so much for your response, and I benefit from your comments a lot.

"Each query to database has to be authenticated and checked against privileges"


For each of query, MySQL will examine if the authenticated user has the show privilege to execute this statement. There are some other privileges as well, like SELECT, UPDATE, EXECUTE, GRANT OPTION, etc. If the user doesn't have the privilege to execute whatever statement it is. reference: http://dev.mysql.com/doc/refman/5.1/en/request-access.html

5.4.5. Access Control, Stage 2: Request Verification

After you establish a connection, the server enters Stage 2 of access control. For each request that you issue via that connection, the server determines what operation you want to perform, then checks whether you have sufficient privileges to do so. This is where the privilege columns in the grant tables come into play. These privileges can come from any of the user, db, host, tables_priv, columns_priv, or procs_priv tables. (You may find it helpful to refer to Section 5.4.2, “Privilege System Grant Tables”, which lists the columns present in each of the grant tables.)


The authentication were done and is not an issue when the connection is established. I take it back.
kevinwen
 
Posts: 109
Joined: Mon Dec 28, 2009 3:44 pm

Re: bug found while using caching and its log

Postby kevinwen » Tue Jan 12, 2010 1:29 pm

Your comments:

I'll look into this bug. Since the row count is produced by a select query, the cache should be refreshed also. What happens if you hit browser "refresh" on this page. Does the count correct itself? (i wonder if there could be a difference between the mysql time and the server time?)


Refreshing in browser doesn't help.
kevinwen
 
Posts: 109
Joined: Mon Dec 28, 2009 3:44 pm

Previous

Return to Xataface Users

Who is online

Users browsing this forum: No registered users and 17 guests

cron
Powered by Dataface
© 2005-2007 Steve Hannah All rights reserved