Performance Issues with large dataset

Hi,

I am in the middle of creating a site that is very data heavy. Gravity Forms tables entries currently at 488,656 rows and meta data at 3,927,916 rows - this comprises several different forms.

One form in particular uses Gravity View to display a list of records on a page (total records currently at 355k - obviously not displaying them all at the same time :slight_smile: ).

The page is: http://new.kresy-siberia.org/wall-of-tribute/. It’s really slow to load. I want to add other search fields to this page, but it can’t handle them right now.

This page: http://new.kresy-siberia.org/hall-of-images/ is better, however if I am adding new images I get regular database connection errors and have also crashed the site several times.

I am pretty sure that the huge meta table is the culprit, I just don’t know where to begin with identifying the issue/optimising/fixing problems.

Can anyone give me any suggestions on how to improve performance? We are not finished with adding data and I worry that after all this time developing it, it’s not up to the job!

Many thanks,

Shelley

It sounds like the hardware may not be up to the job (" regular database connection errors and have also crashed the site"). Additionally, with such a large dataset, you may be running into inherent performance limitations in GravityView or Gravity Forms. Have you already contacted GravityView support for assistance with the pages that use their views?

Hi Chris, thanks for your response. So far I have had minimal support from Gravity View, despite suspecting that the cause was in their plugin somewhere. After further investigation I think the issue is the sort order applied to the data. There is a setting in Gravity View that you can set the order of the data. I had it set to asc on Surname and then again on First Name. Seems like Gravity View can’t handle it. When I contacted them about it this was the response I got:

"Hi Shelley,

Sorry, there’s nothing we can suggest to help you improve that on the software end."

Very disappointing response considering I bought their All Access plan at a whopping $349 a year.

I have some other optimisation tasks to complete, but I don’t think they will really make much difference to this page. Especially considering I have another 600k records to add.

:frowning:

Hello! Did you create these data in Gravity Forms or import them from another source? If you imported them, to what extent did you clean up the dataset beforehand? If there was not much cleaning before they were imported, it may be worthwhile to export everything, clean it thoroughly, and then import it freshly.

Hi Shelley. I’m sorry to hear about your interaction with GravityView. I can speak from personal experience that they have been very helpful in the past with suggestions for improving Gravity Forms queries, aiming the help us so that both their product and ours could be better.

I think rather than focusing on possible performance issues with either plugin, I would focus on measuring the actual performance bottlenecks. You said you have some other optimizations to perform. What are the things you are working on?

With such a large dataset, I would focus on MySQL queries. You can enable the slow query log in MySQL. https://dev.mysql.com/doc/refman/5.7/en/slow-query-log.html

I think if you focus on slow queries first, you may be able to identify where the largest performance issues are. There would also be issues with the memory or processor on the web server. Finally, with such a large dataset, you may find that any off-the-shelf software is NOT up to the task, and you will need a bespoke solution. Or at least some customizations to the software you are using.

Let us know what sort of optimizations you are planning on performing.

Hi Chris,

We have now moved the database to a better spec server. This is the info from the server tech:

Previous server - WWWServer(Apache, PHP, SFTP, website files, until recently MySQL server)

  vCPU = 2      
  RAM = 4 GB
  Storage = EBS(close to SSD)

New server - DBServer where currently we store the NEW database, parameters

  vCPU = 2
  RAM = 8 GB
  Storage Type = SSD
  Storage = 5 GB

I have found that the specific cause is the sort and filter option within the view. When I switch them back to default the increase in performance is instantly visible.

I have passed all this information back to the team at Gravity View and I am pleased to say that they are having a look for me. Whether it is something they can help with remains to be seen, but I am grateful that they are taking the time to check it out.

If I have to then go down the route of hiring a developer with better knowledge than me to enhance Gravity View then so be it.

I really do appreciate everyone’s input, I was coming to the end of my knowledge and hitting a brick wall at every turn which stressed me out :frowning:, so thanks for taking the time to give me some pointers. I will check out the slow query log you have sent me the link for.

Hi,

I did import the records, and it took me several days cleaning up the data to ensure it was the right format to be imported and was in touch with plugin developers to check it was okay. There are a couple of fields that need to be addressed, which I am currently working on, but it turns out that the sort and filter options are causing slow load times. If I set them back to default, page load is vastly improved. This gives me hope that I am closer to fixing the issue :slight_smile:

Great! I’m assuming that you already did the usual moves in phpMyAdmin – select the tables, optimize, and repair them? If so, then increasing the RAM allocation may be needed if it’s still sluggish. Good luck!

Hi Shelley. Sounds like you are making progress, which is great.

I work with a client who uses WordPress for a website which has many millions of articles (posts). Here are the actual numbers:


mysql> SELECT COUNT(*) FROM lws_posts;
+----------+
| COUNT(*) |
+----------+
| 12393815 |
+----------+
1 row in set (26.49 sec)

WordPress could not really handle that. Most anything you did on the site would time out due to the sheer volume of posts and associated data. So we ended up making quite a few modifications and added some custom tables (temp and cache and interim tables) in order to deal with a site with that many posts. We added indexes in some cases and had to alter our code in some cases.

The MySQL slow query log was invaluable in tracking down exactly the things that were causing the site to crash. Once we identified the most expensive queries, we worked to fix them one at a time, and then moved on to the next.

I think it’s great that you have gotten this far with your project, and I hope with some additional tweaks that you will be able to create a solution which works smoothly for your needs. Please keep posting and let us know how it goes. I don’t think this is an issue for support, because it does not look like there is anything inherently wrong in any of your plugins, but your large dataset may show places where improvements can be made.

Thanks again.

1 Like

A little more info about our hosting environment. We are using Google Cloud Platform:

Webserver: n1-standard-4, Ubuntu 18.04.4 LTS, 4 vCPUs, 15 GB memory, 250GB storage
Database server: MySQL 5.7, 1 vCPUs, 3.75 GB Memory, 83 GB SSD storage

Hi

I came across this article as I have slow gravity view load times too but my database isn’t anywhere near as large - 1200 records , each about 20 fields.

I emailed gravity view and this was there response at the bottom of this post.

A key thing they mentioned was object caching that some hosting providers offer or if you have dedicated / vps you can install it yourself (redis or memcached).

I wasn’t able to speed up by disabling sorting and i think it’s because I have some custom fields in gravity view - these contain if statements and i think it means the app has to read every single entry in order to work out what to populate this custom field with. I can’t manage without the custom field so I tried upping my shared hosting to the max performance level they offered. It didn’t make much difference at all. So I’m a little stuck. Will prob end up trying this object caching.

Gravity view response:

Hi Adnaan,

The issue here is not really related to the processor or memory of your server, it’s more a matter of database design.

Gravity Forms uses mainly two tables:

  • gf_entry (where it stores the id of the entry and the related form)
  • gf_entrymeta (where it stores each field of each entry of each form as many individual records)

When there are no sorting or search filters, the query to pull the entries just needs to fetch the first few records from the gf_entry table (the IDs for page 1 of results). Then, using only these results, it executes a second query to pull the actual data from gf_entrymeta. This happens quickly because it’s accessing the big meta table just for the specific page 1 IDs it found on the first query.

However, when filtering or sorting, the process is a little different. It can’t limit the query by the pagination. It has to match every single ID of the gf_entry table to the corresponding meta_key on the big meta table right on the first query, so it can sort or filter the IDs by a certain value. You’ll notice that even if you filter or sort directly on your Gravity Forms Dashboard, it will take a long time to do it when you have a lot of records.

Since we use the GFAPI for retrieving entries, our query is generated directly by Gravity Forms and there isn’t much GravityView can do about it, unfortunately.
There’s also nothing conceptually wrong with this approach (of having a meta record for each field of each entry of the form). It just doesn’t scale so well for large Datasets if there isn’t any type of caching configured for it.

The only suggestion we can make is maybe not sorting by more than one field and/or trying a WP managed host, such as WP Engine. This kind of hosting can provide you with professional solutions for performance such as Object Caching and more

Hi !
I have read your articles carefully. I have 6,700,000 rows in my metadata and 60,000 entries. My biggest form has 170 fields (system, presentation and displayed fields). To go further, I load 2 big js files when I display an entry to add some conditions and do some processing on the page but also several json files of Wordpress data, finally I launch other actions that I generate with many add_filter (15 to 25 depending on the forms).
To display the lists of entries in the front-end, I don’t use gravity view which is too slow in sorting/filtering and can’t meet my needs. I need to modify the entries regularly and respect the conditional rules. For this, I use Gravity Sticky List which is quite well optimised. But there are no more updates with sticky-list and I think I will have problems in the future. I am not aware of the latest improvements in gravity view.
To address the various issues you mentioned, I wrote several queries myself to clean up the gravity form tables and remove regularly many unnecessary rows.
For your information, I have a relatively powerful server with vCPU = 4, RAM = 8GB and 100GB of storage.
I don’t know if you have examples of such advanced use of Gravity Forms which is really great and above all how it happens, which are the limits of GF regarding database size etc.
You will understand that I am afraid of the limits I might encounter.
Thanks for your comments and suggestions.