Statistics data leaves gap if new data matches old


#21

I have no problem running this code:

var a = "1.1";
var b = Convert.ToSingle(a);
Program.Notify("Test Program", a + "<br>" + b);

so Convert.ToSingle() works normally.

But there may be a problem when you try to compare float values, because they may not contain an exact representation of your numbers (see https://stackoverflow.com/questions/7127114/net-floating-point-comparison).
You can use Math.Abs(x1 - x2) < diff, where diff is something small like 0.0001.
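A minimal sketch of that tolerance check, in the same script style as the snippet above (the 0.0001 threshold is arbitrary; pick one that suits your sensor's precision):

```csharp
// Classic example: 0.1 + 0.2 is stored as 0.30000000000000004,
// so an exact == comparison fails while a tolerance check passes.
var x1 = 0.1 + 0.2;
var x2 = 0.3;
var diff = 0.0001;

Program.Notify("Test Program",
    (x1 == x2) + "<br>" +              // False: exact comparison
    (Math.Abs(x1 - x2) < diff));       // True: tolerance comparison
```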


#22

Yeah, you’re right. I looked past that first if statement. It adds the value and then checks whether it’s the same as the previous value. If the data is different, it updates the lastOn, lastOff, and lastEvent values. Those should only be updated on a change when referring to a light switch. For sensors and the like, HG will update the value anyway; it doesn’t matter if those items get updated, since they don’t apply.


#23

Does this mean that only the average value for each 5-minute window is saved in the database? That’s pretty poor resolution. What could be the reason for that? Why not save all values?


#24

Yes, only the average value for each 5-minute interval is stored in the database. I don’t know the reason for this decision. Maybe Gene thought this would be enough, maybe it was something else.
You can lower the save interval and then share the results :blush:


#25

If you see where it’s selecting 5 minutes, point it out. I see that the database only seems to hold average values in the .db file and was curious why.

Also, I can’t seem to compile using ToSingle(dataVal) without getting an error.

I tried using the Convert call right in the if statement, but I got the same error. I guess it’s converting to a Boolean for some reason.


#26

Ok, I’m dumb. :wink:

I forgot that a module is a container for all the parameters; it does not contain a value itself.
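For anyone hitting the same compile error: the value has to be read from a parameter of the module, not from the module object itself. A hedged sketch of what that looks like in an HG program — the parameter name "Meter.Watts" and the exact helper methods are illustrative, so check them against your HG version’s API:

```csharp
// Illustrative: read a numeric value from a module's parameter.
// Converting the module itself (e.g. Convert.ToSingle(module)) fails,
// because the module is only a container for its parameters.
var module = Modules.WithName("Power Meter").Get();
var param = module.Parameter("Meter.Watts");
if (param != null)
{
    // The parameter carries the raw string plus typed accessors;
    // a decimal accessor avoids the Convert.ToSingle() issue entirely.
    var watts = param.DecimalValue;
    Program.Notify("Test Program", "Watts: " + watts);
}
```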


#27

In StatisticsLogger.cs, _statisticsTimeResolutionSeconds is set to 5 minutes (300 seconds). It drives a timer that writes the values to the database.
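Not HG’s actual code, but the mechanism is simple to sketch: buffer the incoming samples, and on each timer tick flush their average to storage. A minimal stand-in (class and member names are mine, not HG’s):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Timers;

// Minimal stand-in for the StatisticsLogger pattern: collect samples,
// then flush their average every N seconds. 300 mirrors the
// _statisticsTimeResolutionSeconds default (5 minutes).
class StatsFlusher
{
    readonly List<double> buffer = new List<double>();
    readonly Timer timer;

    public StatsFlusher(double resolutionSeconds = 300)
    {
        timer = new Timer(resolutionSeconds * 1000);
        timer.Elapsed += (sender, args) => Flush();
        timer.Start();
    }

    public void AddSample(double value)
    {
        lock (buffer) buffer.Add(value);
    }

    void Flush()
    {
        double average;
        lock (buffer)
        {
            if (buffer.Count == 0) return;  // nothing arrived this interval
            average = buffer.Average();
            buffer.Clear();
        }
        // The real logger would INSERT into the stats .db here.
        Console.WriteLine("Stored average: " + average);
    }
}
```

Lowering the interval is just a matter of passing a smaller resolutionSeconds, which is why reducing HG’s constant is the quick experiment suggested above.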


#28

I remember seeing that; I couldn’t remember where it was defined. I broke things yesterday somehow, such that nothing is being stored in the database. I’ll have to fix that before I can investigate the HG code further. I think deleting the database may have been the cause.


#29

Definitely yes.
As I mentioned in my first reply to this topic, HG doesn’t have code to recreate the DB structure after deletion (which surprised me). So you should use a clean database file from the repository.


#30

Wait, so if I delete the database from within HG, it will break the database and require me to restore it from the repository? That seems like a design flaw in the software. I would have expected it to keep a backup that it could restore or copy if it wasn’t able to create a new data structure. Strange.

I’ll grab a new copy and keep a backup on there while I’m at it.


#31

Can you even delete it from HG? I believe you can only clear the database. But I do agree that there should be code to create a new database if one is not found.


#32

So it looks like the reset tool clears the data rather than deleting the .db file. I also confirmed that deleting modules via the web interface does not properly clean them out; I had to edit the modules.xml file to remove old, unwanted modules. The easiest way to do that is to edit the file in Notepad++ (or similar) so you can collapse the blocks you want to delete. Save that file to your Pi home directory, then copy it back to /usr/local/bin/homegenie and restart the service.


#33

I think it’s a good idea to fix these two things (deleting modules and recreating the database structure) and propose them as pull requests to Gene.
It looks like this shouldn’t be too hard to fix.


#34

It appears that the second item (recreating the database) is working correctly. Deleting modules definitely does not work. I’ll post an issue. I don’t know if Gene will do any work on this in the future, but having it documented there is a good thing either way.

Issue 321 has been created.


#35

Do you really need more granular data? A lot of parameters (like temperature, humidity, etc.) don’t change quickly. Also, battery-powered sensors limit the reporting frequency.

Imagine that you have 20 5-in-1 sensors and you save their parameters’ values every minute: 20 sensors × 5 parameters × 1,440 minutes comes to 144,000 records per day in the stats DB.


#36

Most of the time you don’t need more than that. But if you measure power consumption, you can lose some interesting information when you only get the average for the last 5 minutes.

A weather station with wind speed would also suffer from this.

Maybe we need some kind of behaviour that saves the average but also saves “spike” values, or something like that.

I think it should be possible to change this for individual modules, either time-based or real-time.


#37

Exactly. I originally intended to monitor power consumption, and a 5-minute average removes anything interesting from the database. I would have preferred a choice between averaging and the current value, plus a duration option. Since this would generate a lot of data, it might be good to have a second level that reduces resolution to some kind of average after a day/week/month/etc.


#38

Or you could save an average, and also save the specific values that deviate by more than, say, 5% from that average, so you don’t miss out on the interesting values. Maybe the most extreme top/bottom values should be filtered out before the average is calculated. I don’t know if that is done today?
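That hybrid scheme is easy to prototype: keep one average per interval, plus any sample that deviates from it by more than a relative threshold. The names and the 10% default below are illustrative, not anything HG does today (note that one large spike pulls the mean up, so a very tight threshold can end up flagging everything):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative downsampler: returns the interval average plus any
// "spike" samples deviating from it by more than a relative threshold.
static class SpikeAwareDownsampler
{
    public static (double Average, List<double> Spikes) Summarize(
        IList<double> samples, double threshold = 0.10)
    {
        double average = samples.Average();
        var spikes = samples
            .Where(v => Math.Abs(v - average) > Math.Abs(average) * threshold)
            .ToList();
        return (average, spikes);
    }
}
```

For example, Summarize(new double[] { 100, 101, 99, 100, 130 }) gives an average of 106.0 and keeps 130 as the only spike. Filtering extreme outliers before averaging, as suggested above, could be done by summarizing against the median instead of the mean.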


#39

I would personally say that if the value changed from the previous value, it should be recorded in the DB; if you’re polling values every 15 minutes and the value doesn’t change, ignore it. This then raises the question of what precision is used.


#40

I would agree that saving duplicate values isn’t necessary. However, saving the last value before a change would be good. If we record at a given interval and no change occurs for a long period, a level line is appropriate. However, when a change does happen, if the previous time isn’t recorded, the data will show a slope from the first time that value occurred rather than a step from the previous polled time.
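The step-vs-slope behaviour described above can be captured with a small recording rule: skip duplicates, but when a change arrives, first store the old value at the previous poll time so the flat segment is closed off. A sketch (my own names, not HG code; exact equality is fine for switch states, but analog values would want the Math.Abs tolerance from post #21):

```csharp
using System;
using System.Collections.Generic;

// Illustrative change-detecting recorder: duplicates are skipped, but
// when a new value arrives, the previous value is stored at the
// previous poll time first, so a chart draws a step, not a long slope.
class StepRecorder
{
    public readonly List<(DateTime Time, double Value)> Stored =
        new List<(DateTime, double)>();

    double? lastValue;
    DateTime lastTime;
    bool lastWasStored;

    public void Poll(DateTime time, double value)
    {
        if (lastValue == null)
        {
            Stored.Add((time, value));                   // first sample: always store
            lastWasStored = true;
        }
        else if (value != lastValue.Value)
        {
            if (!lastWasStored)
                Stored.Add((lastTime, lastValue.Value)); // close the flat segment
            Stored.Add((time, value));
            lastWasStored = true;
        }
        else
        {
            lastWasStored = false;                       // duplicate: skip it
        }
        lastValue = value;
        lastTime = time;
    }
}
```

Polling the values 0, 0, 0, 5 at times t0..t3 stores (t0, 0), (t2, 0), (t3, 5): the chart steps between t2 and t3 instead of sloping all the way from t0.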