Bug 208210 - Support Trend Analysis using Linear Regression
Summary: Support Trend Analysis using Linear Regression
Status: NEW
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: BIRT
Version: unspecified
Hardware: All All
Importance: P4 enhancement
Target Milestone: Future
Assignee: Mingxia Wu CLA
QA Contact:
URL:
Whiteboard:
Keywords: plan
Depends on:
Blocks:
 
Reported: 2007-10-31 07:29 EDT by Peter Robinson CLA
Modified: 2010-03-30 01:39 EDT
CC List: 3 users

See Also:


Attachments
Chart showing regression lines (74.91 KB, image/jpeg)
2007-10-31 07:29 EDT, Peter Robinson CLA
xml fragment for regression calculation (82.04 KB, application/xml)
2007-11-20 20:45 EST, Peter Robinson CLA

Description Peter Robinson CLA 2007-10-31 07:29:49 EDT
Created attachment 81705
Chart showing regression lines

Provide a way to calculate regression parameters and, from them, generate regression values in a dataset or a table, as well as regression lines on a chart.

The requirement is to take a set of data and calculate its linear regression, giving the m and b parameters of the line Y = mX + b that best fits the data. The regression result r-squared measures how well the calculated line fits the actual data.
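To make the requested calculation concrete, here is a minimal sketch of an ordinary least-squares fit in plain Python. It is not BIRT code; the function name `linear_fit` and the sample data are made up for illustration, and r-squared is taken here to be the coefficient of determination.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b, r_squared)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = sxy / sxx
    b = mean_y - m * mean_x
    # r-squared as 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return m, b, r_squared


# Made-up sample data: x is the period index, y the observed value.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
m, b, r2 = linear_fit(xs, ys)
```

For this sample the fit gives m of about 1.97 and b of about 0.09, with r-squared close to 1.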

When calculating the regression, some options are normally useful:
1. Use either all points in the data, some selected/sampled set of the data, or the last n points.
2. Optionally extend the trend a number of cycles past the end of the given data samples. E.g., when looking at daily samples of data, generate the trend a given number of days into the future.
3. Optionally include some percentage growth factor(s) to be applied at given point(s) in the future trend.
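The three options above could be sketched as follows. This is illustrative Python, not a proposed BIRT API: `fit_line`, `extend_trend`, and their parameters are hypothetical names, and the growth handling is an assumption (a percentage given for a future period is compounded onto that period and every later one), since the request does not specify how growth factors should be applied.

```python
def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def extend_trend(points, last_n=None, periods=0, step=1, growth=None):
    """Fit on all points or only the last `last_n`, then project ahead.

    `growth` maps a future period number (1-based) to a percentage.
    Assumption of this sketch: the uplift compounds onto that period
    and every later one.
    """
    sample = points if last_n is None else points[-last_n:]
    m, b = fit_line(sample)
    last_x = points[-1][0]
    factor = 1.0
    future = []
    for k in range(1, periods + 1):
        factor *= 1.0 + (growth or {}).get(k, 0.0) / 100.0
        x = last_x + k * step
        future.append((x, (m * x + b) * factor))
    return m, b, future


# Made-up daily samples: (day, value).
daily = [(1, 10.0), (2, 12.0), (3, 11.0), (4, 15.0), (5, 17.0), (6, 19.0)]
# Option 1 (all points) with option 2 (extend 3 days).
m_all, b_all, trend_all = extend_trend(daily, periods=3)
# Last 3 points, extended 3 days, with 10% growth from the 2nd future day.
m_last, b_last, trend_last = extend_trend(daily, last_n=3, periods=3, growth={2: 10})
```

A selected or sampled subset could be handled the same way by fitting on that subset instead of a trailing slice.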

For a good example see the Linear regression section here:
http://en.wikipedia.org/wiki/Regression_analysis

See the attached chart with two regression lines: one based on all the data, and the second based on just the last 3 data points. Both lines are extended 3 periods past the end of the data. In both cases r-squared is calculated from the correlation of the whole set of data to the trend line. It can also be calculated as the correlation between just the selected data and the resulting trend line. Both are useful, and perhaps both could be provided.
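The two r-squared variants described above can be illustrated with a short Python sketch. The helper names and the data are made up, and r-squared is computed here as the coefficient of determination (1 minus residual over total sum of squares), which is one possible reading of "correlation" in this request. With that definition the whole-data value can drop below zero when a line fitted on a few points tracks the full series poorly.

```python
def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def r_squared(points, m, b):
    """Coefficient of determination of the line y = m*x + b against `points`."""
    mean_y = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return 1.0 - ss_res / ss_tot


# Made-up samples; the trend is fitted on the last 3 points only.
data = [(1, 10.0), (2, 12.0), (3, 11.0), (4, 15.0), (5, 17.0), (6, 19.0)]
selected = data[-3:]
m, b = fit_line(selected)

r2_selected = r_squared(selected, m, b)  # fit quality on the selected points
r2_whole = r_squared(data, m, b)         # same line judged against all data
```

Here the last three points lie exactly on the line, so `r2_selected` is 1.0, while `r2_whole` is 0.90625 because the earlier points deviate from it.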
Comment 1 Peter Robinson CLA 2007-11-20 20:45:57 EST
Created attachment 83382
xml fragment for regression calculation
Comment 2 Wenfeng Li CLA 2008-12-08 18:49:58 EST
Research is to start in 2.5.0, but the target is set to Future for now until we know we can commit to releasing this in 2.5.