Built-in Filters
All classes and functions listed here are importable from quantopian.pipeline.filters.

QTradableStocksUS()
A daily universe of Quantopian Tradable US Stocks. Equities are filtered in three passes. Each pass operates only on equities that survived the previous pass.

First Pass: Filter based on infrequently-changing attributes using the following rules:
- The stock must be a common (i.e. not preferred) stock.
- The stock must not be a depositary receipt.
- The stock must not be for a limited partnership.
- The stock must not be traded over the counter (OTC).

Second Pass: For companies with more than one share class, choose the most liquid share class. Share classes belonging to the same company are indicated by a common primary_share_class_id. Liquidity is measured using the 200-day median daily dollar volume. Equities without a primary_share_class_id are automatically excluded.

Third Pass: Filter based on dynamic attributes using the following rules:
- The stock must have a 200-day median daily dollar volume exceeding $2.5 million USD.
- The stock must have a market capitalization above $350 million USD, measured as a 20-day simple moving average.
- The stock must have a valid close price on 180 of the last 200 days AND a price on each of the last 20 days.
- The stock must not be an active M&A target. Equities for which IsAnnouncedAcquisitionTarget() returns True are screened out.

Notes
ETFs are not included in this universe. Unlike Q500US() and Q1500US(), this universe has no size cutoff; all equities that match the required criteria are included. If the most liquid share class of a company passes the static pass but fails the dynamic pass, no share class for that company is included.

class StaticAssets(assets)
A Filter that computes True for a specific set of predetermined assets. StaticAssets is mostly useful for debugging or for interactively computing pipeline terms for a fixed set of assets that are known ahead of time.
Parameters: assets (iterable[Asset]) - The assets for which the filter returns True.

class StaticSids(sids)
A Filter that computes True for a specific set of predetermined sids. StaticSids is mostly useful for debugging or for interactively computing pipeline terms for a fixed set of sids that are known ahead of time.
Parameters: sids (iterable[int]) - The sids for which the filter returns True.

Q500US(minimum_market_cap=500000000)
A default universe containing approximately 500 US equities each day. Constituents are chosen at the start of each calendar month by selecting the top 500 "tradeable" stocks by 200-day average dollar volume, capped at 30% of equities allocated to any single sector.

A stock is considered "tradeable" if it meets the following criteria:
- The stock must be the primary share class for its company.
- The company issuing the stock must have a known market capitalization.
- The stock must not be a depositary receipt.
- The stock must not be traded over the counter (OTC).
- The stock must not be for a limited partnership.
- The stock must have a known previous-day close price.
- The stock must have had nonzero volume on the previous trading day.

See also quantopian.pipeline.filters.default_us_equity_universe_mask(), quantopian.pipeline.filters.make_us_equity_universe(), quantopian.pipeline.filters.Q1500US(), quantopian.pipeline.filters.Q3000US()
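For reference, a minimal sketch of using one of these universes as a pipeline screen (the dollar-volume column is illustrative):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.filters import QTradableStocksUS
from quantopian.pipeline.factors import AverageDollarVolume

def make_pipeline():
    # Restrict all pipeline computations to the QTradableStocksUS universe.
    universe = QTradableStocksUS()
    dollar_volume = AverageDollarVolume(window_length=30, mask=universe)
    return Pipeline(
        columns={'dollar_volume': dollar_volume},
        screen=universe,
    )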
Q1500US(minimum_market_cap=500000000)
A default universe containing approximately 1500 US equities each day. Constituents are chosen at the start of each month by selecting the top 1500 "tradeable" stocks by 200-day average dollar volume, capped at 30% of equities allocated to any single sector.

A stock is considered "tradeable" if it meets the following criteria:
- The stock must be the primary share class for its company.
- The company issuing the stock must have a known market capitalization.
- The stock must not be a depositary receipt.
- The stock must not be traded over the counter (OTC).
- The stock must not be for a limited partnership.
- The stock must have a known previous-day close price.
- The stock must have had nonzero volume on the previous trading day.

See also quantopian.pipeline.filters.default_us_equity_universe_mask(), quantopian.pipeline.filters.make_us_equity_universe(), quantopian.pipeline.filters.Q500US(), quantopian.pipeline.filters.Q3000US()

make_us_equity_universe(target_size, rankby, groupby, max_group_weight, mask, smoothing_func=<function downsample_monthly>)
Create a Filter suitable for use as the base universe of a strategy. The resulting Filter accepts approximately the top target_size assets ranked by rankby, subject to tradeability, weighting, and turnover constraints.

The selection algorithm implemented by the generated Filter is as follows:
1. Look at all known stocks and eliminate stocks for which mask returns False.
2. Partition the remaining stocks into buckets based on the labels computed by groupby.
3. Choose the top target_size stocks, sorted by rankby, subject to the constraint that the percentage of stocks accepted in any single group in (2) is less than or equal to max_group_weight.
4. Pass the resulting "naive" filter to smoothing_func, which must return a new Filter. Smoothing is most often useful for applying transformations that reduce turnover at the boundary of the universe's rank-inclusion criterion. For example, a smoothing function might require that an asset pass the naive filter for 5 consecutive days before acceptance, reducing the number of assets that too-regularly enter and exit the universe. Another common smoothing technique is to reduce the frequency at which we recalculate using Filter.downsample. The default smoothing behavior is to downsample to monthly frequency.
5. & the result of smoothing with mask, ensuring that smoothing does not re-introduce masked-out assets.

Parameters:
target_size (int) - Target number of assets in the universe.
rankby (zipline.pipeline.Factor) - The factor by which to rank assets.
groupby (zipline.pipeline.Classifier) - A classifier partitioning assets into the groups bounded by max_group_weight.
max_group_weight (float) - Maximum fraction of the universe that may be drawn from any single group.
mask (zipline.pipeline.Filter) - A filter removing undesirable assets before ranking.
smoothing_func (callable, optional) - A function from Filter to Filter, applied to the naive filter to reduce turnover. Defaults to downsampling to monthly frequency.

Example
The algorithm for the built-in Q500US universe is defined as follows: At the start of each month, choose the top 500 assets by average dollar volume over the last year, ignoring hard-to-trade assets, and choosing no more than 30% of the assets from any single market sector. The Q500US is implemented as:

from quantopian.pipeline import factors, filters, classifiers

def Q500US():
    return filters.make_us_equity_universe(
        target_size=500,
        rankby=factors.AverageDollarVolume(window_length=200),
        mask=filters.default_us_equity_universe_mask(),
        groupby=classifiers.fundamentals.Sector(),
        max_group_weight=0.3,
        smoothing_func=lambda f: f.downsample('month_start'),
    )

See also quantopian.pipeline.filters.default_us_equity_universe_mask(), quantopian.pipeline.filters.Q500US(), quantopian.pipeline.filters.Q1500US(), quantopian.pipeline.filters.Q3000US()

Returns: universe (zipline.pipeline.Filter) - A Filter suitable for use as the base universe of a strategy.

default_us_equity_universe_mask(minimum_market_cap=500000000)
A function returning the default filter used to eliminate undesirable equities from the QUS universes, e.g. Q500US.

The criteria required to pass the resulting filter are as follows:
- The stock must be the primary share class for its company.
- The company issuing the stock must have a minimum market capitalization of minimum_market_cap, defaulting to 500 Million.
- The stock must not be a depositary receipt.
- The stock must not be traded over the counter (OTC).
- The stock must not be for a limited partnership.
- The stock must have a known previous-day close price.
- The stock must have had nonzero volume on the previous trading day.

Notes
We previously had an additional limited partnership check using Fundamentals.limited_partnership, but this provided only false positives beyond those already captured by not_lp_by_name, so it has been removed.

See also quantopian.pipeline.filters.Q500US(), quantopian.pipeline.filters.Q1500US(), quantopian.pipeline.filters.Q3000US(), quantopian.pipeline.filters.make_us_equity_universe()

All classes and functions listed here are importable from quantopian.pipeline.filters.fundamentals.

class IsDepositaryReceipt
A Filter indicating whether a given asset is a depositary receipt.
inputs = (Fundamentals.is_depositary_receipt::bool,)

class IsPrimaryShare
A Filter indicating whether a given asset is a primary share.
inputs = (Fundamentals.is_primary_share::bool,)

is_common_stock()
Construct a Filter indicating whether an asset is common (as opposed to preferred) stock.
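As a sketch, these fundamental filters compose with the usual Filter operators, so a simple custom tradability mask might look like:

from quantopian.pipeline.filters.fundamentals import (
    IsDepositaryReceipt,
    IsPrimaryShare,
    is_common_stock,
)

# Primary-share common stocks that are not depositary receipts.
base_mask = IsPrimaryShare() & is_common_stock() & ~IsDepositaryReceipt()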
Built-in Factors
All classes listed here are importable from quantopian.pipeline.factors.

class DailyReturns(*args, **kwargs)
Calculates daily percent change in close price.
Default Inputs: [USEquityPricing.close]

class Returns(*args, **kwargs)
Calculates the percent change in close price over the given window_length.
Default Inputs: [USEquityPricing.close]

class VWAP(*args, **kwargs)
Volume Weighted Average Price
Default Inputs: [USEquityPricing.close, USEquityPricing.volume]
Default Window Length: None

class AverageDollarVolume(*args, **kwargs)
Average Daily Dollar Volume
Default Inputs: [USEquityPricing.close, USEquityPricing.volume]
Default Window Length: None

class AnnualizedVolatility(*args, **kwargs)
Volatility. The degree of variation of a series over time as measured by the standard deviation of daily returns. https://en.wikipedia.org/wiki/Volatility_(finance)
Default Inputs: zipline.pipeline.factors.Returns(window_length=2)
Parameters: annualization_factor (float, optional) - The number of time units per year. Default is 252, the number of NYSE trading days in a normal year.

class SimpleBeta(*args, **kwargs)
Factor producing the slope of a regression line of each asset's daily returns against the daily returns of a single "target" asset.
Parameters:
target (zipline.Asset) - Asset against which other assets should be regressed.
regression_length (int) - Number of days of daily returns to use for the regression.
allowed_missing_percentage (float, optional) - Percentage of returns observations (between 0 and 1) that are allowed to be missing when calculating betas. Assets with more than this percentage of returns observations missing will produce values of NaN. Default behavior is that 25% of inputs can be missing.
target
Get the target of the beta calculation.

class SimpleMovingAverage(*args, **kwargs)
Average value of an arbitrary column
Default Inputs: None
Default Window Length: None

class Latest(*args, **kwargs)
Factor producing the most recently-known value of inputs[0] on each day. The .latest attribute of DataSet columns returns an instance of this Factor.

class MaxDrawdown(*args, **kwargs)
Max Drawdown
Default Inputs: None
Default Window Length: None

class RSI(*args, **kwargs)
Relative Strength Index
Default Inputs: [USEquityPricing.close]
Default Window Length: 15
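A brief, illustrative sketch of constructing a few of these factors in an algorithm (the SPY sid shown is an assumption for illustration):

from quantopian.pipeline.data import USEquityPricing
from quantopian.pipeline.factors import RSI, SimpleBeta, SimpleMovingAverage

# 10-day simple moving average of close price.
sma_10 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=10)

# Relative Strength Index with its default 15-day window.
rsi = RSI()

# One-year beta to SPY (sid 8554 on Quantopian).
beta = SimpleBeta(target=sid(8554), regression_length=252)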
class ExponentialWeightedMovingAverage(*args, **kwargs)
Exponentially Weighted Moving Average
Default Inputs: None
Default Window Length: None
Parameters:
inputs (length-1 list/tuple of BoundColumn) - The expression over which to compute the average.
window_length (int > 0) - Length of the lookback window over which to compute the average.
decay_rate (float, 0 < decay_rate <= 1) - Weighting factor by which to discount past observations. When calculating historical averages, rows are multiplied by the sequence: decay_rate, decay_rate ** 2, decay_rate ** 3, ...
Notes
This class can also be imported under the name EWMA.
See also pandas.ewma()

Alternate Constructors:

from_span(inputs, window_length, span, **kwargs)
Convenience constructor for passing decay_rate in terms of span. Forwards decay_rate as 1 - (2.0 / (1 + span)). This provides the behavior equivalent to passing span to pandas.ewma.
Examples

# Equivalent to:
# my_ewma = EWMA(
#     inputs=[USEquityPricing.close],
#     window_length=30,
#     decay_rate=(1 - (2.0 / (1 + 15.0))),
# )
my_ewma = EWMA.from_span(
    inputs=[USEquityPricing.close],
    window_length=30,
    span=15,
)

Notes
This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

from_center_of_mass(inputs, window_length, center_of_mass, **kwargs)
Convenience constructor for passing decay_rate in terms of center of mass. Forwards decay_rate as 1 - (1 / (1 + center_of_mass)). This provides behavior equivalent to passing center_of_mass to pandas.ewma.
Examples

# Equivalent to:
# my_ewma = EWMA(
#     inputs=[USEquityPricing.close],
#     window_length=30,
#     decay_rate=(1 - (1 / (1 + 15.0))),
# )
my_ewma = EWMA.from_center_of_mass(
    inputs=[USEquityPricing.close],
    window_length=30,
    center_of_mass=15,
)

Notes
This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

from_halflife(inputs, window_length, halflife, **kwargs)
Convenience constructor for passing decay_rate in terms of half life. Forwards decay_rate as exp(log(.5) / halflife). This provides the behavior equivalent to passing halflife to pandas.ewma.
Examples

# Equivalent to:
# my_ewma = EWMA(
#     inputs=[USEquityPricing.close],
#     window_length=30,
#     decay_rate=np.exp(np.log(0.5) / 15),
# )
my_ewma = EWMA.from_halflife(
    inputs=[USEquityPricing.close],
    window_length=30,
    halflife=15,
)

Notes
This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

class ExponentialWeightedMovingStdDev(*args, **kwargs)
Exponentially Weighted Moving Standard Deviation
Default Inputs: None
Default Window Length: None
Parameters:
inputs (length-1 list/tuple of BoundColumn) - The expression over which to compute the average.
window_length (int > 0) - Length of the lookback window over which to compute the average.
decay_rate (float, 0 < decay_rate <= 1) - Weighting factor by which to discount past observations. When calculating historical averages, rows are multiplied by the sequence: decay_rate, decay_rate ** 2, decay_rate ** 3, ...
Notes
This class can also be imported under the name EWMSTD.
See also pandas.ewmstd()

Alternate Constructors:

from_span(inputs, window_length, span, **kwargs)
Convenience constructor for passing decay_rate in terms of span. Forwards decay_rate as 1 - (2.0 / (1 + span)). This provides the behavior equivalent to passing span to pandas.ewma.
Examples

# Equivalent to:
# my_ewma = EWMA(
#     inputs=[USEquityPricing.close],
#     window_length=30,
#     decay_rate=(1 - (2.0 / (1 + 15.0))),
# )
my_ewma = EWMA.from_span(
    inputs=[USEquityPricing.close],
    window_length=30,
    span=15,
)

Notes
This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

from_center_of_mass(inputs, window_length, center_of_mass, **kwargs)
Convenience constructor for passing decay_rate in terms of center of mass. Forwards decay_rate as 1 - (1 / (1 + center_of_mass)). This provides behavior equivalent to passing center_of_mass to pandas.ewma.
Examples

# Equivalent to:
# my_ewma = EWMA(
#     inputs=[USEquityPricing.close],
#     window_length=30,
#     decay_rate=(1 - (1 / (1 + 15.0))),
# )
my_ewma = EWMA.from_center_of_mass(
    inputs=[USEquityPricing.close],
    window_length=30,
    center_of_mass=15,
)

Notes
This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.
from_halflife(inputs, window_length, halflife, **kwargs)
Convenience constructor for passing decay_rate in terms of half life. Forwards decay_rate as exp(log(.5) / halflife). This provides the behavior equivalent to passing halflife to pandas.ewma.
Examples

# Equivalent to:
# my_ewma = EWMA(
#     inputs=[USEquityPricing.close],
#     window_length=30,
#     decay_rate=np.exp(np.log(0.5) / 15),
# )
my_ewma = EWMA.from_halflife(
    inputs=[USEquityPricing.close],
    window_length=30,
    halflife=15,
)

Notes
This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

class WeightedAverageValue(*args, **kwargs)
Helper for VWAP-like computations.
Default Inputs: None
Default Window Length: None

class BollingerBands(*args, **kwargs)
Bollinger Bands technical indicator. https://en.wikipedia.org/wiki/Bollinger_Bands
Default Inputs: zipline.pipeline.data.USEquityPricing.close
Parameters:
inputs (length-1 iterable[BoundColumn]) - The expression over which to compute bollinger bands.
window_length (int > 0) - Length of the lookback window over which to compute the bollinger bands.
k (float) - The number of standard deviations to add or subtract to create the upper and lower bands.

class MovingAverageConvergenceDivergenceSignal(*args, **kwargs)
Moving Average Convergence/Divergence (MACD) Signal line. https://en.wikipedia.org/wiki/MACD
A technical indicator originally developed by Gerald Appel in the late 1970's. MACD shows the relationship between two moving averages and reveals changes in the strength, direction, momentum, and duration of a trend in a stock's price.
Default Inputs: zipline.pipeline.data.USEquityPricing.close
Parameters:
fast_period (int > 0, optional) - The window length for the "fast" EWMA. Default is 12.
slow_period (int > 0, > fast_period, optional) - The window length for the "slow" EWMA. Default is 26.
signal_period (int > 0, < fast_period, optional) - The window length for the signal line. Default is 9.
Notes
Unlike most pipeline expressions, this factor does not accept a window_length parameter. window_length is inferred from slow_period and signal_period.
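As an illustrative sketch, constructing these indicators (the upper/middle/lower attribute names follow the factor's multiple outputs; treat the exact names as an assumption here):

from quantopian.pipeline.factors import (
    BollingerBands,
    MovingAverageConvergenceDivergenceSignal,
)

# 20-day Bollinger Bands at 2 standard deviations.
bbands = BollingerBands(window_length=20, k=2)
upper, middle, lower = bbands.upper, bbands.middle, bbands.lower

# MACD signal line with the conventional 12/26/9 periods.
macd_signal = MovingAverageConvergenceDivergenceSignal(
    fast_period=12,
    slow_period=26,
    signal_period=9,
)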
class RollingPearsonOfReturns(*args, **kwargs)
Calculates the Pearson product-moment correlation coefficient of the returns of the given asset with the returns of all other assets. Pearson correlation is what most people mean when they say "correlation coefficient" or "R-value".
Parameters:
target (zipline.assets.Asset) - The asset to correlate with all other assets.
returns_length (int >= 2) - Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
correlation_length (int >= 1) - Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) - A Filter describing which assets should have their correlation with the target asset computed each day.
Notes
Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which correlations are computed.

Examples
Let the following be example 10-day returns for three different assets:

               SPY    MSFT     FB
2017-03-13    -.03     .03    .04
2017-03-14    -.02    -.03    .02
2017-03-15    -.01     .02    .01
2017-03-16      0     -.02    .01
2017-03-17     .01     .04   -.01
2017-03-20     .02    -.03   -.02
2017-03-21     .03     .01   -.02
2017-03-22     .04    -.02   -.02

Suppose we are interested in SPY's rolling returns correlation with each stock from 2017-03-17 to 2017-03-22, using a 5-day look back window (that is, we calculate each correlation coefficient over 5 days of data). We can achieve this by doing:

rolling_correlations = RollingPearsonOfReturns(
    target=sid(8554),
    returns_length=10,
    correlation_length=5,
)

The result of computing rolling_correlations from 2017-03-17 to 2017-03-22 gives:

               SPY    MSFT     FB
2017-03-17      1      .15    -.96
2017-03-20      1      .10    -.96
2017-03-21      1     -.16    -.94
2017-03-22      1     -.16    -.85

Note that the column for SPY is all 1's, as the correlation of any data series with itself is always 1. To understand how each of the other values was calculated, take for example the .15 in MSFT's column. This is the correlation coefficient between SPY's returns looking back from 2017-03-17 (-.03, -.02, -.01, 0, .01) and MSFT's returns (.03, -.03, .02, -.02, .04).

See also zipline.pipeline.factors.RollingSpearmanOfReturns, zipline.pipeline.factors.RollingLinearRegressionOfReturns

class RollingSpearmanOfReturns(*args, **kwargs)
Calculates the Spearman rank correlation coefficient of the returns of the given asset with the returns of all other assets.
Parameters:
target (zipline.assets.Asset) - The asset to correlate with all other assets.
returns_length (int >= 2) - Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
correlation_length (int >= 1) - Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) - A Filter describing which assets should have their correlation with the target asset computed each day.
Notes
Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which correlations are computed.
See also zipline.pipeline.factors.RollingPearsonOfReturns, zipline.pipeline.factors.RollingLinearRegressionOfReturns

class RollingLinearRegressionOfReturns(*args, **kwargs)
Perform an ordinary least-squares regression predicting the returns of all other assets on the given asset.
Parameters:
target (zipline.assets.Asset) - The asset to regress against all other assets.
returns_length (int >= 2) - Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
regression_length (int >= 1) - Length of the lookback window over which to compute each regression.
mask (zipline.pipeline.Filter, optional) - A Filter describing which assets should be regressed against the target asset each day.
Notes
Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which regressions are computed.
This factor is designed to return five outputs:
- alpha, a factor that computes the intercepts of each regression.
- beta, a factor that computes the slopes of each regression.
- r_value, a factor that computes the correlation coefficient of each regression.
- p_value, a factor that computes, for each regression, the two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero.
- stderr, a factor that computes the standard error of the estimate of each regression.
For more help on factors with multiple outputs, see zipline.pipeline.factors.CustomFactor.

Examples
Let the following be example 10-day returns for three different assets:

               SPY    MSFT     FB
2017-03-13    -.03     .03    .04
2017-03-14    -.02    -.03    .02
2017-03-15    -.01     .02    .01
2017-03-16      0     -.02    .01
2017-03-17     .01     .04   -.01
2017-03-20     .02    -.03   -.02
2017-03-21     .03     .01   -.02
2017-03-22     .04    -.02   -.02

Suppose we are interested in predicting each stock's returns from SPY's over rolling 5-day look back windows. We can compute rolling regression coefficients (alpha and beta) from 2017-03-17 to 2017-03-22 by doing:

regression_factor = RollingLinearRegressionOfReturns(
    target=sid(8554),
    returns_length=10,
    regression_length=5,
)
alpha = regression_factor.alpha
beta = regression_factor.beta

The result of computing alpha from 2017-03-17 to 2017-03-22 gives:

               SPY    MSFT     FB
2017-03-17      0     .011   .003
2017-03-20      0    -.004   .004
2017-03-21      0     .007   .006
2017-03-22      0     .002   .008

And the result of computing beta from 2017-03-17 to 2017-03-22 gives:

               SPY    MSFT     FB
2017-03-17      1      .3    -1.1
2017-03-20      1      .2     -1
2017-03-21      1     -.3     -1
2017-03-22      1     -.3    -.9

Note that SPY's column for alpha is all 0's and for beta is all 1's, as the regression line of SPY with itself is simply the function y = x. To understand how each of the other values was calculated, take for example MSFT's alpha and beta values on 2017-03-17 (.011 and .3, respectively). These values are the result of running a linear regression predicting MSFT's returns from SPY's returns, using values starting at 2017-03-17 and looking back 5 days. That is, the regression was run with x = [-.03, -.02, -.01, 0, .01] and y = [.03, -.03, .02, -.02, .04], and it produced a slope of .3 and an intercept of .011.

See also zipline.pipeline.factors.RollingPearsonOfReturns, zipline.pipeline.factors.RollingSpearmanOfReturns

All classes and functions listed here are importable from quantopian.pipeline.factors.fundamentals.

class MarketCap
Factor producing the most recently known market cap for each asset.
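For instance, a minimal sketch of using MarketCap to build a large-cap filter:

from quantopian.pipeline.factors.fundamentals import MarketCap

# True for the 500 assets with the largest most-recently-known market cap.
largest_500_by_cap = MarketCap().top(500)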
Base Classes
Base classes defined in zipline that provide pipeline functionality. NOTE: Pipeline does not currently support futures data.

class BoundColumn
A column of data that's been concretely bound to a particular dataset. Instances of this class are dynamically created upon access to attributes of DataSets (for example, USEquityPricing.close is an instance of this class).

dtype (numpy.dtype) - The dtype of data produced when this column is loaded.
latest (zipline.pipeline.data.Factor or zipline.pipeline.data.Filter) - A Filter, Factor, or Classifier computing the most recently known value of this column on each date. Produces a Filter if self.dtype == np.bool_. Produces a Classifier if self.dtype == np.int64. Otherwise produces a Factor.
dataset (zipline.pipeline.data.DataSet) - The dataset to which this column is bound.
name (str) - The name of this column.
metadata (dict) - Extra metadata associated with this column.
qualname - The fully-qualified name of this column. Generated by doing '.'.join([self.dataset.__name__, self.name]).

class Factor
Pipeline API expression producing a numerical or date-valued output.

Factors are the most commonly-used Pipeline term, representing the result of any computation producing a numerical result. Factors can be combined, both with other Factors and with scalar values, via any of the builtin mathematical operators (+, -, *, etc). This makes it easy to write complex expressions that combine multiple Factors. For example, constructing a Factor that computes the average of two other Factors is simply:

>>> f1 = SomeFactor(...)
>>> f2 = SomeOtherFactor(...)
>>> average = (f1 + f2) / 2.0

Factors can also be converted into zipline.pipeline.Filter objects via comparison operators: (<, <=, !=, eq, >, >=). There are many natural operators defined on Factors besides the basic numerical operators. These include methods identifying missing or extreme-valued outputs (isnull, notnull, isnan, notnan), methods for normalizing outputs (rank, demean, zscore), and methods for constructing Filters based on rank-order properties of results (top, bottom, percentile_between).

eq(other)
Binary Operator: '=='

demean(mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))
Construct a Factor that computes self and subtracts the mean from each row of the result.
If mask is supplied, ignore values where mask returns False when computing row means, and output NaN anywhere the mask is False.
If groupby is supplied, compute by partitioning each row based on the values produced by groupby, de-meaning the partitioned arrays, and stitching the sub-results back together.
Parameters:
mask (zipline.pipeline.Filter, optional) - A Filter defining values to ignore when computing means.
groupby (zipline.pipeline.Classifier, optional) - A classifier defining partitions over which to compute means.

Examples
Let f be a Factor which would produce the following output:

               AAPL   MSFT    MCD     BK
2017-03-13      1.0    2.0    3.0    4.0
2017-03-14      1.5    2.5    3.5    1.0
2017-03-15      2.0    3.0    4.0    1.5
2017-03-16      2.5    3.5    1.0    2.0

Let c be a Classifier producing the following output:

               AAPL   MSFT    MCD     BK
2017-03-13        1      1      2      2
2017-03-14        1      1      2      2
2017-03-15        1      1      2      2
2017-03-16        1      1      2      2

Let m be a Filter producing the following output:

               AAPL   MSFT    MCD     BK
2017-03-13    False   True   True   True
2017-03-14     True  False   True   True
2017-03-15     True   True  False   True
2017-03-16     True   True   True  False

Then f.demean() will subtract the mean from each row produced by f.
               AAPL    MSFT     MCD      BK
2017-03-13   -1.500  -0.500   0.500   1.500
2017-03-14   -0.625   0.375   1.375  -1.125
2017-03-15   -0.625   0.375   1.375  -1.125
2017-03-16    0.250   1.250  -1.250  -0.250

f.demean(mask=m) will subtract the mean from each row, but means will be calculated ignoring values on the diagonal, and NaNs will be written to the diagonal in the output. Diagonal values are ignored because they are the locations where the mask m produced False.

               AAPL    MSFT     MCD      BK
2017-03-13      NaN  -1.000   0.000   1.000
2017-03-14   -0.500     NaN   1.500  -1.000
2017-03-15   -0.166   0.833     NaN  -0.666
2017-03-16    0.166   1.166  -1.333     NaN

f.demean(groupby=c) will subtract the group-mean of AAPL/MSFT and MCD/BK from their respective entries. AAPL/MSFT are grouped together because both assets always produce 1 in the output of the classifier c. Similarly, MCD/BK are grouped together because they always produce 2.

               AAPL    MSFT     MCD      BK
2017-03-13   -0.500   0.500  -0.500   0.500
2017-03-14   -0.500   0.500   1.250  -1.250
2017-03-15   -0.500   0.500   1.250  -1.250
2017-03-16   -0.500   0.500  -0.500   0.500

f.demean(mask=m, groupby=c) will also subtract the group-mean of AAPL/MSFT and MCD/BK, but means will be calculated ignoring values on the diagonal, and NaNs will be written to the diagonal in the output.

               AAPL    MSFT     MCD      BK
2017-03-13      NaN   0.000  -0.500   0.500
2017-03-14    0.000     NaN   1.250  -1.250
2017-03-15   -0.500   0.500     NaN   0.000
2017-03-16   -0.500   0.500   0.000     NaN

Notes
Mean is sensitive to the magnitudes of outliers. When working with a factor that can potentially produce large outliers, it is often useful to use the mask parameter to discard values at the extremes of the distribution:

>>> base = MyFactor(...)
>>> normalized = base.demean(
...     mask=base.percentile_between(1, 99),
... )

demean() is only supported on Factors of dtype float64.
See also pandas.DataFrame.groupby()

zscore(mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))
Construct a Factor that Z-Scores each day's results. The Z-Score of a row is defined as:

(row - row.mean()) / row.stddev()

If mask is supplied, ignore values where mask returns False when computing row means and standard deviations, and output NaN anywhere the mask is False.
If groupby is supplied, compute by partitioning each row based on the values produced by groupby, z-scoring the partitioned arrays, and stitching the sub-results back together.
Parameters:
mask (zipline.pipeline.Filter, optional) - A Filter defining values to ignore when Z-Scoring.
groupby (zipline.pipeline.Classifier, optional) - A classifier defining partitions over which to compute Z-Scores.
Returns: zscored (zipline.pipeline.Factor) - A Factor that z-scores the output of self.
Notes
Mean and standard deviation are sensitive to the magnitudes of outliers. When working with a factor that can potentially produce large outliers, it is often useful to use the mask parameter to discard values at the extremes of the distribution:

>>> base = MyFactor(...)
>>> normalized = base.zscore(
...     mask=base.percentile_between(1, 99),
... )

zscore() is only supported on Factors of dtype float64.
Examples
See demean() for an in-depth example of the semantics for mask and groupby.
See also pandas.DataFrame.groupby()
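As a short sketch of the groupby semantics in practice (Sector is the built-in fundamentals classifier; my_factor is a placeholder):

from quantopian.pipeline.classifiers.fundamentals import Sector

# De-mean within each sector, then z-score the result across each row.
sector_neutral = my_factor.demean(groupby=Sector()).zscore()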
rank(method='ordinal', ascending=True, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))
Construct a new Factor representing the sorted rank of each column within each row.
Parameters:
method (str, {'ordinal', 'min', 'max', 'dense', 'average'}) - The method used to assign ranks to tied elements. See scipy.stats.rankdata for a full description of the semantics for each ranking method. Default is 'ordinal'.
ascending (bool, optional) - Whether to return sorted rank in ascending or descending order. Default is True.
mask (zipline.pipeline.Filter, optional) - A Filter representing assets to consider when computing ranks. If mask is supplied, ranks are computed ignoring any asset/date pairs for which mask produces a value of False.
groupby (zipline.pipeline.Classifier, optional) - A classifier defining partitions over which to perform ranking.
Returns: ranks (zipline.pipeline.factors.Rank) - A new factor that will compute the ranking of the data produced by self.
Notes
The default value for method is different from the default for scipy.stats.rankdata. See that function's documentation for a full description of the valid inputs to method. Missing or non-existent data on a given day will cause an asset to be given a rank of NaN for that day.
See also scipy.stats.rankdata(), zipline.pipeline.factors.factor.Rank

pearsonr(target, correlation_length, mask=sentinel('NotSpecified'))
Construct a new Factor that computes rolling pearson correlation coefficients between target and the columns of self.
This method can only be called on factors which are deemed safe for use as inputs to other factors. This includes Returns and any factors created from Factor.rank or Factor.zscore.
Parameters:
target (zipline.pipeline.Term with a numeric dtype) - The term used to compute correlations against each column of data produced by self. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, correlations are computed asset-wise.
correlation_length (int) - Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) - A Filter describing which assets should have their correlation with the target slice computed each day.
Returns: correlations (zipline.pipeline.factors.RollingPearson) - A new Factor that will compute correlations between target and the columns of self.
Examples
Suppose we want to create a factor that computes the correlation between AAPL's 10-day returns and the 10-day returns of all other assets, computing each correlation over 30 days. This can be achieved by doing the following:

returns = Returns(window_length=10)
returns_slice = returns[sid(24)]
aapl_correlations = returns.pearsonr(
    target=returns_slice,
    correlation_length=30,
)

This is equivalent to doing:

aapl_correlations = RollingPearsonOfReturns(
    target=sid(24),
    returns_length=10,
    correlation_length=30,
)

See also scipy.stats.pearsonr(), zipline.pipeline.factors.RollingPearsonOfReturns, Factor.spearmanr()

spearmanr(target, correlation_length, mask=sentinel('NotSpecified'))
Construct a new Factor that computes rolling spearman rank correlation coefficients between target and the columns of self.
This method can only be called on factors which are deemed safe for use as inputs to other factors. This includes Returns and any factors created from Factor.rank or Factor.zscore.
Parameters:
target (zipline.pipeline.Term with a numeric dtype) - The term used to compute correlations against each column of data produced by self. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, correlations are computed asset-wise.
correlation_length (int) - Length of the lookback window over which to compute each correlation coefficient.
mask (zipline.pipeline.Filter, optional) - A Filter describing which assets should have their correlation with the target slice computed each day.
Returns: correlations (zipline.pipeline.factors.RollingSpearman) - A new Factor that will compute correlations between target and the columns of self.
Examples
Suppose we want to create a factor that computes the correlation between AAPL's 10-day returns and the 10-day returns of all other assets, computing each correlation over 30 days. This can be achieved by doing the following:

returns = Returns(window_length=10)
returns_slice = returns[sid(24)]
aapl_correlations = returns.spearmanr(
    target=returns_slice,
    correlation_length=30,
)

This is equivalent to doing:

aapl_correlations = RollingSpearmanOfReturns(
    target=sid(24),
    returns_length=10,
    correlation_length=30,
)

See also scipy.stats.spearmanr(), zipline.pipeline.factors.RollingSpearmanOfReturns, Factor.pearsonr()

linear_regression(target, regression_length, mask=sentinel('NotSpecified'))
Construct a new Factor that performs an ordinary least-squares regression predicting the columns of self from target.
This method can only be called on factors which are deemed safe for use as inputs to other factors. This includes Returns and any factors created from Factor.rank or Factor.zscore.
Parameters:
target (zipline.pipeline.Term with a numeric dtype) - The term to use as the predictor/independent variable in each regression. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, regressions are computed asset-wise.
regression_length (int) - Length of the lookback window over which to compute each regression.
mask (zipline.pipeline.Filter, optional) - A Filter describing which assets should be regressed with the target slice each day.
Returns: regressions (zipline.pipeline.factors.RollingLinearRegression) - A new Factor that will compute linear regressions of target against the columns of self.
Examples
Suppose we want to create a factor that regresses AAPL's 10-day returns against the 10-day returns of all other assets, computing each regression over 30 days. This can be achieved by doing the following:

returns = Returns(window_length=10)
returns_slice = returns[sid(24)]
aapl_regressions = returns.linear_regression(
    target=returns_slice,
    regression_length=30,
)

This is equivalent to doing:

aapl_regressions = RollingLinearRegressionOfReturns(
    target=sid(24),
    returns_length=10,
    regression_length=30,
)

See also scipy.stats.linregress(), zipline.pipeline.factors.RollingLinearRegressionOfReturns

quantiles(bins, mask=sentinel('NotSpecified'))
Construct a Classifier computing quantiles of the output of self. Every non-NaN data point in the output is labelled with an integer value from 0 to (bins - 1). NaNs are labelled with -1.
If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.
Parameters:
bins (int) - Number of bin labels to compute.
mask (zipline.pipeline.Filter, optional) - Mask of values to ignore when computing quantiles.
Returns: quantiles (zipline.pipeline.classifiers.Quantiles) - A Classifier producing integer labels ranging from 0 to (bins - 1).

quartiles(mask=sentinel('NotSpecified'))
Construct a Classifier computing quartiles over the output of self. Every non-NaN data point in the output is labelled with a value of either 0, 1, 2, or 3, corresponding to the first, second, third, or fourth quartile over each row. NaN data points are labelled with -1.
If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.
Parameters: mask (zipline.pipeline.Filter, optional) - Mask of values to ignore when computing quartiles.
Returns: quartiles (zipline.pipeline.classifiers.Quantiles) - A Classifier producing integer labels ranging from 0 to 3.

quintiles(mask=sentinel('NotSpecified'))
Construct a Classifier computing quintile labels on self. Every non-NaN data point in the output is labelled with a value of 0, 1, 2, 3, or 4, corresponding to quintiles over each row. NaN data points are labelled with -1.
If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.
Parameters: mask (zipline.pipeline.Filter, optional) - Mask of values to ignore when computing quintiles.
Returns: quintiles (zipline.pipeline.classifiers.Quantiles) - A Classifier producing integer labels ranging from 0 to 4.

deciles(mask=sentinel('NotSpecified'))
Construct a Classifier computing decile labels on self. Every non-NaN data point in the output is labelled with a value from 0 to 9 corresponding to deciles over each row. NaN data points are labelled with -1.
If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.
Parameters: mask (zipline.pipeline.Filter, optional) - Mask of values to ignore when computing deciles.
Returns: deciles (zipline.pipeline.classifiers.Quantiles) - A Classifier producing integer labels ranging from 0 to 9.

top(N, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))
Construct a Filter matching the top N asset values of self each day. If groupby is supplied, returns a Filter matching the top N asset values for each group.
Parameters:
N (int) - Number of assets passing the returned filter each day.
mask (zipline.pipeline.Filter, optional) - A Filter representing assets to consider when computing ranks. If mask is supplied, top values are computed ignoring any asset/date pairs for which mask produces a value of False.
groupby (zipline.pipeline.Classifier, optional) - A classifier defining partitions over which to perform ranking.
Returns: filter (zipline.pipeline.filters.Filter)

bottom(N, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))
Construct a Filter matching the bottom N asset values of self each day. If groupby is supplied, returns a Filter matching the bottom N asset values for each group.
Parameters:
N (int) - Number of assets passing the returned filter each day.
mask (zipline.pipeline.Filter, optional) - A Filter representing assets to consider when computing ranks. If mask is supplied, bottom values are computed ignoring any asset/date pairs for which mask produces a value of False.
groupby (zipline.pipeline.Classifier, optional) - A classifier defining partitions over which to perform ranking.
Returns: filter (zipline.pipeline.Filter)
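To illustrate how these methods compose (my_factor and universe are placeholders):

# Decile labels for each asset's factor value within the universe.
factor_deciles = my_factor.deciles(mask=universe)

# The 50 highest and 50 lowest factor values each day.
longs = my_factor.top(50, mask=universe)
shorts = my_factor.bottom(50, mask=universe)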
percentile_between(min_percentile, max_percentile, mask=sentinel('NotSpecified'))
Construct a new Filter representing entries from the output of this Factor that fall within the percentile range defined by min_percentile and max_percentile.
Parameters:
min_percentile (float [0.0, 100.0]) - Return True for assets falling above this percentile in the data.
max_percentile (float [0.0, 100.0]) - Return True for assets falling below this percentile in the data.
mask (zipline.pipeline.Filter, optional) - A Filter representing assets to consider when calculating percentile thresholds. If mask is supplied, percentile cutoffs are computed each day using only assets for which mask returns True. Assets for which mask produces False will produce False in the output of this Factor as well.
Returns: out (zipline.pipeline.filters.PercentileFilter) - A new filter that will compute the specified percentile-range mask.
See also zipline.pipeline.filters.filter.PercentileFilter()

isnan()
A Filter producing True for all values where this Factor is NaN.
Returns: nanfilter (zipline.pipeline.filters.Filter)

notnan()
A Filter producing True for values where this Factor is not NaN.
Returns: nanfilter (zipline.pipeline.filters.Filter)

isfinite()
A Filter producing True for values where this Factor is anything but NaN, inf, or -inf.

Notes: In addition to its named methods, Factor implements the following binary operators producing new factors: +, -, *, /, **, %. Factor also implements the following comparison operators producing filters: <, <=, !=, >=, >. For internal technical reasons, Factor does not override ==. The eq method can be used to produce a Filter that performs a direct equality comparison against the output of a factor.

class Filter
Pipeline expression computing a boolean output.

Filters are most commonly useful for describing sets of assets to include or exclude for some particular purpose. Many Pipeline API functions accept a mask argument, which can be supplied a Filter indicating that only values passing the Filter should be considered when performing the requested computation. For example, zipline.pipeline.Factor.top() accepts a mask indicating that ranks should be computed only on assets that passed the specified Filter.

The most common way to construct a Filter is via one of the comparison operators (<, <=, !=, eq, >, >=) of Factor. For example, a natural way to construct a Filter for stocks with a 10-day VWAP less than $20.0 is to first construct a Factor computing 10-day VWAP and compare it to the scalar value 20.0:

>>> from zipline.pipeline.factors import VWAP
>>> vwap_10 = VWAP(window_length=10)
>>> vwaps_under_20 = (vwap_10 <= 20)

Filters can also be constructed via comparisons between two Factors. For example, to construct a Filter producing True for asset/date pairs where the asset's 10-day VWAP was greater than its 30-day VWAP:

>>> short_vwap = VWAP(window_length=10)
>>> long_vwap = VWAP(window_length=30)
>>> higher_short_vwap = (short_vwap > long_vwap)

Filters can be combined via the & (and) and | (or) operators. &-ing together two filters produces a new Filter that produces True if both of the inputs produced True. |-ing together two filters produces a new Filter that produces True if either of its inputs produced True. The ~ operator can be used to invert a Filter, swapping all True values with Falses and vice-versa.

Filters may be set as the screen attribute of a Pipeline, indicating that asset/date pairs for which the filter produces False should be excluded from the Pipeline's output. This is useful both for reducing noise in the output of a Pipeline and for reducing memory consumption of Pipeline results.

downsample(frequency)
Make a term that computes from self at lower-than-daily frequency.
Parameters: frequency ({'year_start', 'quarter_start', 'month_start', 'week_start'}) - A string indicating the desired sampling dates:
- 'year_start' -> first trading day of each year
- 'quarter_start' -> first trading day of January, April, July, October
- 'month_start' -> first trading day of each month
- 'week_start' -> first trading day of each week

class Classifier
A Pipeline expression computing a categorical output.

Classifiers are most commonly useful for describing grouping keys for complex transformations on Factor outputs. For example, Factor.demean() and Factor.zscore() can be passed a Classifier in their groupby argument, indicating that means/standard deviations should be computed on assets for which the classifier produced the same label.

isnull()
A Filter producing True for values where this term has missing data.

notnull()
A Filter producing True for values where this term has complete data.

eq(other)
Construct a Filter returning True for asset/date pairs where the output of self matches other.

startswith(prefix)
Construct a Filter matching values starting with prefix.
Parameters: prefix (str) - String prefix against which to compare values produced by self.
Returns: matches (Filter) - Filter returning True for all sid/date pairs for which self produces a string starting with prefix.

endswith(suffix)
Construct a Filter matching values ending with suffix.
Parameters: suffix (str) - String suffix against which to compare values produced by self.
Returns: matches (Filter) - Filter returning True for all sid/date pairs for which self produces a string ending with suffix.

has_substring(substring)
Construct a Filter matching values containing substring.
Parameters: substring (str) - Sub-string against which to compare values produced by self.
Returns: matches (Filter) - Filter returning True for all sid/date pairs for which self produces a string containing substring.

matches(pattern)
Construct a Filter that checks regex matches against pattern.
Parameters: pattern (str) - Regex pattern against which to compare values produced by self.
Returns: matches (Filter) - Filter returning True for all sid/date pairs for which self produces a string matched by pattern.
See also Python Regular Expressions

element_of(choices)
Construct a Filter indicating whether values are in choices.
Parameters: choices (iterable[str or int]) - An iterable of choices.
Returns: matches (Filter) - Filter returning True for all sid/date pairs for which self produces an entry in choices.

downsample(frequency)
Make a term that computes from self at lower-than-daily frequency.
Parameters: frequency ({'year_start', 'quarter_start', 'month_start', 'week_start'}) - A string indicating the desired sampling dates:
- 'year_start' -> first trading day of each year
- 'quarter_start' -> first trading day of January, April, July, October
- 'month_start' -> first trading day of each month
- 'week_start' -> first trading day of each week
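A short sketch of these Classifier methods in action (Fundamentals.exchange_id is used as an example string-valued column, and the 'NYS'/'NAS' label values are assumptions for illustration):

from quantopian.pipeline.data import Fundamentals

exchange = Fundamentals.exchange_id.latest

# True for assets whose latest exchange_id is one of the given labels.
on_major_exchange = exchange.element_of(['NYS', 'NAS'])

# Recompute the classifier only on the first trading day of each month.
monthly_exchange = exchange.downsample('month_start')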
Research Pipeline
Clicking the dataset link in the Self-Serve Data section takes you to a page with a How to Use example that shows you how to set up the full pipeline.
Historical Data Lookup
Continuous futures make it easy to look up the historical price or volume data of a future, without having to think about the individual contracts underneath it. To get the history of a future, you can simply pass a continuous future as the first argument to data.history().

def initialize(context):
    # S&P 500 E-Mini Continuous Future
    context.future = continuous_future('ES')
    schedule_function(daily_func, date_rules.every_day(), time_rules.market_open())

def daily_func(context, data):
    es_history = data.history(context.future, fields='price', bar_count=5, frequency='1d')

Daily frequency history is built on 24-hour trade data. A daily bar for US futures captures the trade activity from 6pm on the previous day to 6pm on the current day (Eastern Time). For example, the Monday daily bar captures trade activity from 6pm the day before (Sunday) to 6pm on Monday. Tuesday's daily bar runs from 6pm Monday to 6pm Tuesday, and so on. Minute data runs 24 hours and is available from 6pm Sunday to 6pm Friday each week.

In general, there is a price difference between consecutive contracts for the same underlying asset/commodity at a given point in time. This difference is caused by the opportunity cost and the carry cost of holding the underlying asset for a longer period of time. For example, the storage cost of holding 1000 barrels of oil might push the cost of a contract with delivery in a year to be higher than the cost of a contract with delivery in a month - the cost of storing them for a year would be significant. These price differences tend not to represent differences in the value of the underlying asset or commodity. To factor these costs out of a historical series of pricing data, we make an adjustment on a continuous future. The adjustment removes the difference in cost between consecutive contracts when the continuous future is stitched together. By default, the price history of a continuous future is adjusted. Prices are adjusted backwards from the current simulation date in a backtest. By using adjusted prices, you can make meaningful computations on a continuous price series. For example, plotting the adjusted price of the crude oil continuous future alongside the prices of the individual 'active' contracts underneath shows that the adjusted price factors out the jump between each pair of consecutive contracts.

The adjustment method can be specified as an argument to the continuous_future function. This can either be set to adjustment='add', which subtracts the difference in consecutive contract prices from the historical series, or 'mul', which multiplies out the ratio between consecutive contracts. The 'mul' technique is the default used on Quantopian.

The continuous_future function also takes other optional arguments. The offset argument allows you to specify whether you want to maintain a reference to the front contract or to a back contract. Setting offset=0 (default) maintains a reference to the front contract, or the contract with the next soonest delivery. Setting offset=1 creates a continuous reference to the contract with the second closest date of delivery, and so on.

The roll argument allows you to specify the method for determining when the continuous future should start pointing to the next contract. By setting roll='calendar', the continuous future will start pointing to the next contract as the 'active' one when it reaches the auto_close_date of the current contract.
By setting roll='volume', the continuous future will start pointing to the next contract as the 'active' one when the volume of the next contract surpasses the volume of the current active contract, as long as it's within 7 days of the auto_close_date of the current contract.

The following code snippet creates a continuous future for the front contract, using the calendar roll method, to be adjusted using the 'mul' technique when making historical data requests.

def initialize(context):
    # S&P 500 E-Mini Continuous Future
    context.future = continuous_future('ES', offset=0, roll='calendar', adjustment='mul')

Continuous futures are explained further in the Futures Tutorial.
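As a supplementary sketch, the specific contract a continuous future currently points to can be looked up with data.current (the 'contract' field follows the platform's documented usage; treat it as an assumption here):

def before_trading_start(context, data):
    # The individual ES contract the continuous future currently references.
    active_contract = data.current(context.future, 'contract')

    # Adjusted price history for the stitched continuous series.
    prices = data.history(context.future, fields='price', bar_count=30, frequency='1d')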
Constraints
class Constraint
Base class for constraints.

class MaxGrossExposure(max)
Constraint on the maximum gross exposure for the portfolio. Requires that the sum of the absolute values of the portfolio weights be less than max.
Parameters: max (float) - The maximum gross exposure of the portfolio.
Examples
MaxGrossExposure(1.5) constrains the total value of the portfolio's longs and shorts to be no more than 1.5x the current portfolio value.

class NetExposure(min, max)
Constraint on the net exposure of the portfolio. Requires that the sum of the weights (positive or negative) of all assets in the portfolio fall between min and max.
Parameters:
min (float) - The minimum net exposure of the portfolio.
max (float) - The maximum net exposure of the portfolio.
Examples
NetExposure(-0.1, 0.1) constrains the difference in value between the portfolio's longs and shorts to be between -10% and 10% of the current portfolio value.

class DollarNeutral(tolerance=0.0001)
Constraint requiring that long and short exposures be equal-sized. Requires that the sum of the weights (positive or negative) of all assets in the portfolio fall between +-tolerance.
Parameters: tolerance (float, optional) - Allowed magnitude of net market exposure. Default is 0.0001.

class NetGroupExposure(labels, min_weights, max_weights, etf_lookthru=None)
Constraint requiring bounded net exposure to groups of assets. Groups are defined by a map from (asset -> label). Each unique label generates a constraint specifying that the sum of the weights of assets mapped to that label should fall between lower and upper bounds. Min/max group exposures are specified as maps from (label -> float). Examples of common group labels are sector, industry, and country.
Parameters:
labels (pd.Series[Asset -> object] or dict[Asset -> object]) - Map from asset -> group label.
min_weights (pd.Series[object -> float] or dict[object -> float]) - Map from group label to minimum net exposure to assets in that group.
max_weights (pd.Series[object -> float] or dict[object -> float]) - Map from group label to maximum net exposure to assets in that group.
etf_lookthru (pd.DataFrame, optional) - Indexed by constituent assets x ETFs, expresses the weight of each constituent in each ETF. A DataFrame containing ETF constituents data. Each column of the frame should contain weights (from 0.0 to 1.0) representing the holdings for an ETF. Each row should contain weights for a single stock in each ETF. Columns should sum approximately to 1.0. If supplied, ETF holdings in the current and target portfolio will be decomposed into their constituents before constraints are applied.
Examples
Suppose we're interested in four stocks: AAPL, MSFT, TSLA, and GM. AAPL and MSFT are in the technology sector. TSLA and GM are in the consumer cyclical sector. We want no more than 50% of our portfolio to be in the technology sector, and we want no more than 25% of our portfolio to be in the consumer cyclical sector. We can construct a constraint expressing the above preferences as follows:

# Map from each asset to its sector.
labels = {AAPL: 'TECH', MSFT: 'TECH', TSLA: 'CC', GM: 'CC'}

# Maps from each sector to its min/max exposure.
min_exposures = {'TECH': -0.5, 'CC': -0.25}
max_exposures = {'TECH': 0.5, 'CC': 0.25}

constraint = NetGroupExposure(labels, min_exposures, max_exposures)

Notes
For a group that should not have a lower exposure bound, set:
min_weights[group_label] = opt.NotConstrained
For a group that should not have an upper exposure bound, set:
max_weights[group_label] = opt.NotConstrained

classmethod with_equal_bounds(labels, min, max, etf_lookthru=None)
Special case constructor that applies static lower and upper bounds to all groups. NetGroupExposure.with_equal_bounds(labels, min, max) is equivalent to:

NetGroupExposure(
    labels=labels,
    min_weights=pd.Series(index=labels.unique(), data=min),
    max_weights=pd.Series(index=labels.unique(), data=max),
)

Parameters:
labels (pd.Series[Asset -> object] or dict[Asset -> object]) - Map from asset -> group label.
min (float) - Lower bound for exposure to any group.
max (float) - Upper bound for exposure to any group.

class PositionConcentration(min_weights, max_weights, default_min_weight=0.0, default_max_weight=0.0, etf_lookthru=None)
Constraint enforcing minimum/maximum position weights.
Parameters:
min_weights (pd.Series[Asset -> float] or dict[Asset -> float]) - Map from asset to minimum position weight for that asset.
max_weights (pd.Series[Asset -> float] or dict[Asset -> float]) - Map from asset to maximum position weight for that asset.
default_min_weight (float, optional) - Value to use as a lower bound for assets not found in min_weights. Default is 0.0.
default_max_weight (float, optional) - Value to use as an upper bound for assets not found in max_weights. Default is 0.0.
etf_lookthru (pd.DataFrame, optional) - Indexed by constituent assets x ETFs, expresses the weight of each constituent in each ETF. A DataFrame containing ETF constituents data. Each column of the frame should contain weights (from 0.0 to 1.0) representing the holdings for an ETF. Each row should contain weights for a single stock in each ETF. Columns should sum approximately to 1.0. If supplied, ETF holdings in the current and target portfolio will be decomposed into their constituents before constraints are applied.
Notes
Negative weight values are interpreted as bounds on the magnitude of short positions. A minimum weight of 0.0 constrains an asset to be long-only. A maximum weight of 0.0 constrains an asset to be short-only.
A common special case is to create a PositionConcentration constraint that applies a shared lower/upper bound to all assets. An alternate constructor, PositionConcentration.with_equal_bounds(), provides a simpler API supporting this use-case.

classmethod with_equal_bounds(min, max, etf_lookthru=None)
Special case constructor that applies static lower and upper bounds to all assets. PositionConcentration.with_equal_bounds(min, max) is equivalent to:

PositionConcentration(pd.Series(), pd.Series(), min, max)

Parameters:
min (float) - Minimum position weight for all assets.
max (float) - Maximum position weight for all assets.

class FactorExposure(loadings, min_exposures, max_exposures)
Constraint requiring bounded net exposure to a set of risk factors. Factor loadings are specified as a DataFrame of floats whose columns are factor labels and whose index contains Assets. Minimum and maximum factor exposures are specified as maps from factor label to min/max net exposure.
For each column in the loadings frame, we constrain:

(new_weights * loadings[column]).sum() >= min_exposures[column]
(new_weights * loadings[column]).sum() <= max_exposures[column]

Parameters: loadings (pd.DataFrame) - An (assets x labels) frame of weights for each (asset, factor) pair. min_exposures (dict or pd.Series) - Minimum net exposure values for each factor. max_exposures (dict or pd.Series) - Maximum net exposure values for each factor. class Pair(long, short, hedge_ratio=1.0, tolerance=0.0) A constraint representing a pair of inverse-weighted stocks. Parameters: long (Asset) - The asset to long. short (Asset) - The asset to short. hedge_ratio (float, optional) - The ratio between the respective absolute values of the long and short weights. Required to be greater than 0. Default is 1.0, signifying equal weighting. tolerance (float, optional) - The amount by which the hedge ratio of the calculated weights is allowed to differ from the given hedge ratio, in either direction. Required to be greater than or equal to 0. Default is 0.0. class Basket(assets, min_net_exposure, max_net_exposure) Constraint requiring bounded net exposure to a basket of stocks. Parameters: assets (iterable[Asset]) - Assets to be constrained. min_net_exposure (float) - Minimum allowed net exposure to the basket. max_net_exposure (float) - Maximum allowed net exposure to the basket. class Frozen(asset_or_assets, max_error_display=10) Constraint for assets whose positions cannot change. Parameters: asset_or_assets (Asset or sequence[Asset]) - Asset(s) whose weight(s) cannot change. class ReduceOnly(asset_or_assets, max_error_display=10) Constraint for assets whose weights can only move toward zero and cannot cross zero. Parameters: asset_or_assets (Asset or sequence[Asset]) - The asset(s) whose position weight(s) cannot increase in magnitude. class LongOnly(asset_or_assets, max_error_display=10) Constraint for assets that cannot be held in short positions. Parameters: asset_or_assets (Asset or iterable[Asset]) - The asset(s) that must be long or zero. class ShortOnly(asset_or_assets, max_error_display=10) Constraint for assets that cannot be held in long positions. Parameters: asset_or_assets (Asset or iterable[Asset]) - The asset(s) that must be short or zero. class FixedWeight(asset, weight) A constraint representing an asset whose position weight is fixed at a specified value. Parameters: asset (Asset) - The asset whose weight is fixed. weight (float) - The weight at which asset should be fixed. class CannotHold(asset_or_assets, max_error_display=10) Constraint for assets whose position sizes must be 0. Parameters: asset_or_assets (Asset or iterable[Asset]) - The asset(s) that cannot be held.
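A typical use combines several of these constraints and passes them to order_optimal_portfolio. The following is a minimal sketch; the sector_labels series, the bounds, and the helper name are hypothetical (in practice the labels might come from a pipeline Sector classifier):

import quantopian.optimize as opt

def make_constraints(sector_labels):
    # sector_labels: pd.Series mapping each Asset to a sector label
    # (hypothetical; e.g. built from a pipeline Sector() column).
    return [
        # Keep gross leverage at or below 1x.
        opt.MaxGrossExposure(1.0),
        # Cap net exposure to every sector at +/-10%.
        opt.NetGroupExposure.with_equal_bounds(sector_labels, min=-0.10, max=0.10),
        # Keep every individual position within +/-2% of portfolio value.
        opt.PositionConcentration.with_equal_bounds(min=-0.02, max=0.02),
    ]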
Adding Your Dataset
Navigate to the Data tab on your account page and click Add Dataset under the Self-Serve Data section.
Accessing Your Dataset
Private datasets are accessible via pipeline in notebooks and algorithms. Interactive datasets and deltas are also available in research.
Self-Serve Data
Self-Serve Data provides you the ability to upload your own time-series data to Quantopian and access it in research and the IDE directly via Pipeline. You can add historical data as well as live-updating data on a nightly basis.
Important Date Assumptions
When surfacing alternative data in pipeline, Quantopian makes assumptions based on a common data processing workflow for dataset providers. Typically data is analyzed after market close and is stamped as a value as of that day. Since the data was not available for the full trading day, it will only be surfaced in pipeline the next trading day.
Research Interactive
You will also be able to look at the raw upload via the interactive version:

from quantopian.interactive.data.<UserId> import <dataset> as dataset
print len(dataset)
dataset.dshape

If live loads are configured, you may start to see adjustment records appearing in the deltas table:

from quantopian.interactive.data.<UserId> import <dataset>_deltas as datasetd
datasetd['timestamp'].max()
Pipeline API Core Classes
class Pipeline(columns=None, screen=None) A Pipeline object represents a collection of named expressions to be compiled and executed by a PipelineEngine. A Pipeline has two important attributes: 'columns', a dictionary of named Term instances, and 'screen', a Filter representing criteria for including an asset in the results of a Pipeline. To compute a pipeline in the context of a TradingAlgorithm, users must call attach_pipeline in their initialize function to register that the pipeline should be computed each trading day. The outputs of a pipeline on a given day can be accessed by calling pipeline_output in handle_data or before_trading_start. Parameters: columns (dict, optional) - Initial columns. screen (zipline.pipeline.term.Filter, optional) - Initial screen. add(term, name, overwrite=False) Add a column. The results of computing term will show up as a column in the DataFrame produced by running this pipeline. Parameters: term (zipline.pipeline.Term) - A Filter, Factor, or Classifier to add to the pipeline. name (str) - Name of the column to add. overwrite (bool) - Whether to overwrite the existing entry if we already have a column named name. remove(name) Remove a column. Parameters: name (str) - The name of the column to remove. Raises: KeyError - If name is not in self.columns. Returns: removed (zipline.pipeline.term.Term) - The removed term. set_screen(screen, overwrite=False) Set a screen on this Pipeline. Parameters: screen (zipline.pipeline.Filter) - The filter to apply as a screen. overwrite (bool) - Whether to overwrite any existing screen. If overwrite is False and self.screen is not None, we raise an error. show_graph(format='svg') Render this Pipeline as a DAG. Parameters: format ({'svg', 'png', 'jpeg'}) - Image format to render with. Default is 'svg'. Attributes: columns - The columns registered with this pipeline. screen - The screen applied to the rows of this pipeline. class CustomFactor Base class for user-defined Factors. Parameters: inputs (iterable, optional) - An iterable of BoundColumn instances (e.g. USEquityPricing.close) describing the data to load and pass to compute. window_length (int, optional) - Number of rows to pass for each input. mask (zipline.pipeline.Filter, optional) - A Filter describing the assets on which we should compute each day. outputs (iterable[str], optional) - An iterable of strings naming each output this factor should compute and return. Notes Users implementing their own Factors should subclass CustomFactor and implement a method named compute with the following signature:

def compute(self, today, assets, out, *inputs):
    ...

On each simulation date, compute will be called with the current date, an array of sids, an output array, and an input array for each expression passed as inputs to the CustomFactor constructor. The specific types of the values passed to compute are as follows:

today : np.datetime64[ns]
    Row label for the last row of all arrays passed as `inputs`.
assets : np.array[int64, ndim=1]
    Column labels for `out` and `inputs`.
out : np.array[self.dtype, ndim=1]
    Output array of the same shape as `assets`. `compute` should write
    its desired return values into `out`. If multiple outputs are
    specified, `compute` should write its desired return values into
    `out.<output_name>` for each output name in `self.outputs`.
*inputs : tuple of np.array
    Raw data arrays corresponding to the values of `self.inputs`.

compute functions should expect to be passed NaN values for dates on which no data was available for an asset. This may include dates on which an asset did not yet exist. For example, if a CustomFactor requires 10 rows of close price data, and asset A started trading on Monday June 2nd, 2014, then on Tuesday, June 3rd, 2014, the column of input data for asset A will have 9 leading NaNs for the preceding days on which data was not yet available. Examples A CustomFactor with pre-declared defaults:

class TenDayRange(CustomFactor):
    """
    Computes the difference between the highest high in the last 10
    days and the lowest low.
    Pre-declares high and low as default inputs and `window_length`
    as 10.
    """
    inputs = [USEquityPricing.high, USEquityPricing.low]
    window_length = 10

    def compute(self, today, assets, out, highs, lows):
        from numpy import nanmin, nanmax
        highest_highs = nanmax(highs, axis=0)
        lowest_lows = nanmin(lows, axis=0)
        out[:] = highest_highs - lowest_lows

# Doesn't require passing inputs or window_length because they're
# pre-declared as defaults for the TenDayRange class.
ten_day_range = TenDayRange()

A CustomFactor without defaults:

class MedianValue(CustomFactor):
    """
    Computes the median value of an arbitrary single input over an
    arbitrary window.

    Does not declare any defaults, so values for `window_length` and
    `inputs` must be passed explicitly on every construction.
    """
    def compute(self, today, assets, out, data):
        from numpy import nanmedian
        out[:] = nanmedian(data, axis=0)

# Values for `inputs` and `window_length` must be passed explicitly to
# MedianValue.
median_close10 = MedianValue([USEquityPricing.close], window_length=10)
median_low15 = MedianValue([USEquityPricing.low], window_length=15)

A CustomFactor with multiple outputs:

class MultipleOutputs(CustomFactor):
    inputs = [USEquityPricing.close]
    outputs = ['alpha', 'beta']
    window_length = N

    def compute(self, today, assets, out, close):
        computed_alpha, computed_beta = some_function(close)
        out.alpha[:] = computed_alpha
        out.beta[:] = computed_beta

# Each output is returned as its own Factor upon instantiation.
alpha, beta = MultipleOutputs()

# Equivalently, we can create a single factor instance and access each
# output as an attribute of that instance.
multiple_outputs = MultipleOutputs()
alpha = multiple_outputs.alpha
beta = multiple_outputs.beta

Note: If a CustomFactor has multiple outputs, all outputs must have the same dtype. For instance, in the example above, if alpha is a float then beta must also be a float.
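For context, here is a minimal sketch of how a Pipeline and a CustomFactor are typically wired into an algorithm. It reuses the TenDayRange factor defined above; the pipeline name 'my_pipeline' is arbitrary:

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.filters import QTradableStocksUS

def initialize(context):
    pipe = Pipeline(
        columns={'ten_day_range': TenDayRange()},  # factor defined above
        screen=QTradableStocksUS(),
    )
    attach_pipeline(pipe, 'my_pipeline')

def before_trading_start(context, data):
    # DataFrame indexed by asset, with one column per pipeline term.
    context.output = pipeline_output('my_pipeline')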
Optimize API
The Optimize API allows Quantopian users to describe their desired orders in terms of high-level Objectives and Constraints.
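For example, here is a minimal sketch of the typical flow. context.alphas is a hypothetical pd.Series of expected-return scores per asset, assumed to be computed elsewhere (e.g. from a pipeline factor):

import quantopian.optimize as opt

def rebalance(context, data):
    # Trade toward the portfolio that maximizes alpha subject to
    # leverage and market-neutrality constraints.
    objective = opt.MaximizeAlpha(context.alphas)
    constraints = [
        opt.MaxGrossExposure(1.0),
        opt.DollarNeutral(),
    ]
    order_optimal_portfolio(objective, constraints)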
Data Methods
The data object passed to handle_data, before_trading_start, and all scheduled methods is the algorithm's gateway to all of Quantopian's minutely pricing data. data.current(assets, fields) Returns the current value of the given assets for the given fields at the current algorithm time. Current values are the as-traded price (except if they have to be forward-filled across an adjustment boundary). Parameters assets: Asset or iterable of Assets. fields: string or iterable of strings. Valid values are 'price', 'last_traded', 'open', 'high', 'low', 'close', 'volume', 'contract' (for ContinuousFutures), or column names in Fetcher files. Returns If a single asset and a single field are passed in, a scalar value is returned. If a single asset and a list of fields are passed in, a pandas Series is returned whose indices are the fields, and whose values are scalar values for this asset for each field. If a list of assets and a single field are passed in, a pandas Series is returned whose indices are the assets, and whose values are scalar values for each asset for the given field. If a list of assets and a list of fields are passed in, a pandas DataFrame is returned, indexed by asset. The columns are the requested fields, filled with the scalar values for each asset for each field. Notes 'price' returns the last known close price of the asset. If there is no last known value (either because the asset has never traded, or because it has delisted) NaN is returned. If a value is found, and we had to cross an adjustment boundary (split, dividend, etc.) to get it, the value is adjusted before being returned. 'price' is always forward-filled. 'last_traded' returns the date of the last trade event of the asset, even if the asset has stopped trading. If there is no last known value, pd.NaT is returned. 'volume' returns the trade volume for the current simulation time. If there is no trade this minute, 0 is returned. 'open', 'high', 'low', and 'close' return the relevant information for the current trade bar. If there is no current trade bar, NaN is returned. These fields are never forward-filled. 'contract' returns the current active contract of a continuous future. This field is never forward-filled. data.history(assets, fields, bar_count, frequency) Returns a window of data for the given assets and fields. This data is adjusted for splits, dividends, and mergers as of the current algorithm time. The semantics of missing data are identical to the ones described in the notes for data.current. Parameters assets: Asset or iterable of Assets. fields: string or iterable of strings. Valid values are 'price', 'open', 'high', 'low', 'close', 'volume', or column names in Fetcher files. bar_count: integer number of bars of trade data. frequency: string. '1m' for minutely data or '1d' for daily data. Returns Series or DataFrame or Panel, depending on the dimensionality of the 'assets' and 'fields' parameters. If a single asset and a single field are passed in, the returned Series is indexed by date. If multiple assets and a single field are passed in, the returned DataFrame is indexed by date and has assets as columns. If a single asset and multiple fields are passed in, the returned DataFrame is indexed by date and has fields as columns. If multiple assets and multiple fields are passed in, the returned Panel is indexed by field, has dt as the major axis, and assets as the minor axis. Notes These notes are identical to the information for data.current. 'price' returns the last known close price of the asset.
If there is no last known value (either because the asset has never traded, or because it has delisted) NaN is returned. If a value is found, and we had to cross an adjustment boundary (split, dividend, etc.) to get it, the value is adjusted before being returned. 'price' is always forward-filled. 'volume' returns the trade volume for the current simulation time. If there is no trade this minute, 0 is returned. 'open', 'high', 'low', and 'close' return the relevant information for the current trade bar. If there is no current trade bar, NaN is returned. These fields are never forward-filled. When requesting minute-frequency historical data for futures in backtesting, you have access to data outside the 6:30AM-5PM trading window. If you request the last 60 minutes of pricing data for a Future or a ContinuousFuture on the US Futures calendar at 7:00AM, you will get the pricing data from 6:00AM-7:00AM. Similarly, daily history for futures captures trade activity from 6pm-6pm ET (24 hours). For example, the Monday daily bar captures trade activity from 6pm the day before (Sunday) to 6pm on Monday. Tuesday's daily bar will run from 6pm Monday to 6pm Tuesday, etc. If you request a historical window of data for equities that extends earlier than equity market open on the US Futures calendar, the data returned will include NaN (open/high/low/close), 0 (volume), or a forward-filled value from the end of the previous day (price), as we do not have data outside of market hours. Note that if you request a window of minute-level data for equities that extends earlier than market open on the US Equities calendar, your window will include data from the end of the previous day instead. data.can_trade(assets) For the given asset or iterable of assets, returns true if the security has a known last price, is currently listed on a supported exchange, and is not currently restricted. For continuous futures, can_trade returns true if the simulation date is between the start_date and auto_close_date. Parameters assets: Asset or iterable of Assets. Returns boolean or Series of booleans, indexed by asset. data.is_stale(assets) For the given asset or iterable of assets, returns true if the asset has ever traded and there is no trade data for the current simulation time. If the asset has never traded, returns False. If the current simulation time is not a valid market time, we use the current time to check if the asset is alive, but we use the last market minute/day for the trade data check. Parameters assets: Asset or iterable of Assets. Returns boolean or Series of booleans, indexed by asset. data.current_chain(continuous_future) Gets the current forward-looking chain of all contracts for the given continuous future which have begun trading at the simulation time. Parameters continuous_future: A ContinuousFuture object. Returns A list of contracts making up the future chain (in order of delivery) for the specified continuous future at the current simulation time. data.fetcher_assets Returns a list of assets from the algorithm's Fetcher file that are active for the current simulation time. This lets your algorithm know what assets are available in the Fetcher file. Returns A list of assets from the algorithm's Fetcher file that are active for the current simulation time.
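A short sketch of these methods in use (the 20-day window and the 5% target weight are arbitrary):

def handle_data(context, data):
    aapl = symbol('AAPL')
    # Scalar: last known close price, forward-filled.
    price = data.current(aapl, 'price')
    # DataFrame indexed by date with 'close' and 'volume' columns.
    window = data.history(aapl, ['close', 'volume'], 20, '1d')
    # Only order if the asset is listed, priced, and unrestricted.
    if data.can_trade(aapl):
        order_target_percent(aapl, 0.05)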
Keyboard Shortcuts
.c or .continue - Resumes the backtest until the next breakpoint is hit or an exception is raised.
.n or .next - Executes the next line of code. This steps over the next expression.
.s or .step - Executes the next line of code. This steps into the next expression, following function calls.
.r or .return - Executes until you are about to return from the current function call.
.f or .finish - Disables all breakpoints and finishes the backtest.
.clear - Clears the data in the debugger window.
.h or .help - Shows the command shortcuts for the debugger.
Built-in Classifiers
All classes listed here are importable from quantopian.pipeline.classifiers.fundamentals.

class SuperSector Classifier that groups assets by Morningstar Super Sector. There are three possible classifications: 1 - Cyclical, 2 - Defensive, 3 - Sensitive. These values are provided as integer constants on the class. For more information on Morningstar classification codes, see: https://www.quantopian.com/help/fundamentals#industry-sector.

CYCLICAL = 1
DEFENSIVE = 2
SENSITIVE = 3
SUPER_SECTOR_NAMES = {1: 'CYCLICAL', 2: 'DEFENSIVE', 3: 'SENSITIVE'}
dtype = dtype('int64')
inputs = (Fundamentals.morningstar_economy_sphere_code::int64,)
missing_value = -1

class Sector Classifier that groups assets by Morningstar Sector Code. There are 11 possible classifications: 101 - Basic Materials, 102 - Consumer Cyclical, 103 - Financial Services, 104 - Real Estate, 205 - Consumer Defensive, 206 - Healthcare, 207 - Utilities, 308 - Communication Services, 309 - Energy, 310 - Industrials, 311 - Technology. These values are provided as integer constants on the class. For more information on Morningstar classification codes, see: https://www.quantopian.com/help/fundamentals#industry-sector.

BASIC_MATERIALS = 101
CONSUMER_CYCLICAL = 102
FINANCIAL_SERVICES = 103
REAL_ESTATE = 104
CONSUMER_DEFENSIVE = 205
HEALTHCARE = 206
UTILITIES = 207
COMMUNICATION_SERVICES = 308
ENERGY = 309
INDUSTRIALS = 310
TECHNOLOGY = 311
SECTOR_NAMES = {101: 'BASIC_MATERIALS', 102: 'CONSUMER_CYCLICAL', 103: 'FINANCIAL_SERVICES', 104: 'REAL_ESTATE', 205: 'CONSUMER_DEFENSIVE', 206: 'HEALTHCARE', 207: 'UTILITIES', 308: 'COMMUNICATION_SERVICES', 309: 'ENERGY', 310: 'INDUSTRIALS', 311: 'TECHNOLOGY'}
dtype = dtype('int64')
inputs = (Fundamentals.morningstar_sector_code::int64,)
missing_value = -1
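For example, a minimal sketch using the Sector classifier as both a pipeline column and a screen:

from quantopian.pipeline import Pipeline
from quantopian.pipeline.classifiers.fundamentals import Sector

def make_pipeline():
    sector = Sector()
    # Emit each asset's sector code and keep only technology stocks;
    # the classifier can also drive grouped operations such as
    # some_factor.demean(groupby=sector).
    return Pipeline(
        columns={'sector': sector},
        screen=sector.eq(Sector.TECHNOLOGY),
    )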
Available Futures
Below is a list of futures that are currently available on Quantopian. Note that some of these futures may no longer be trading but are included in research and backtesting to avoid forward lookahead bias. The data for some futures in the Quantopian database starts after 2002.

Name - Root Symbol - Exchange
Soybean Oil - BO - CBOT
Corn E-Mini - CM - CBOT
Corn - CN - CBOT
Ethanol - ET - CBOT
30-Day Federal Funds - FF - CBOT
5-Year Deliverable Interest Rate Swap Futures - FI - CBOT
TNote 5 yr - FV - CBOT
Soybeans E-Mini - MS - CBOT
Wheat E-Mini - MW - CBOT
Oats - OA - CBOT
Rough Rice - RR - CBOT
Soybean Meal - SM - CBOT
Soybeans - SY - CBOT
10-Year Deliverable Interest Rate Swap Futures - TN - CBOT
TNote 2 yr - TU - CBOT
TNote 10 yr - TY - CBOT
Ultra Tbond - UB - CBOT
TBond 30 yr - US - CBOT
Wheat - WC - CBOT
Dow Jones E-mini - YM - CBOT
Big Dow - BD - CBOT (no longer trading)
DJIA Futures - DJ - CBOT (no longer trading)
5-Year Interest Rate Swap Futures - FS - CBOT (no longer trading)
Municipal Bonds - MB - CBOT (no longer trading)
10-Year Interest Rate Swap Futures - TS - CBOT (no longer trading)
VIX Futures - VX - CFE
Australian Dollar - AD - CME
Bloomberg Commodity Index Futures - AI - CME
British Pound - BP - CME
Canadian Dollar - CD - CME
Euro FX - EC - CME
Eurodollar - ED - CME
Euro FX E-mini - EE - CME
S&P 500 E-Mini - ES - CME
E-micro EUR/USD Futures - EU - CME
Feeder Cattle - FC - CME
Japanese Yen E-mini - JE - CME
Japanese Yen - JY - CME
Lumber - LB - CME
Live Cattle - LC - CME
Lean Hogs - LH - CME
Mexican Peso - ME - CME
S&P 400 MidCap E-Mini - MI - CME
Nikkei 225 Futures - NK - CME
NASDAQ 100 E-Mini - NQ - CME
New Zealand Dollar - NZ - CME
Swiss Franc - SF - CME
S&P 500 Futures - SP - CME
S&P 400 MidCap Futures - MD - CME (no longer trading)
NASDAQ 100 Futures - ND - CME (no longer trading)
Pork Bellies - PB - CME (no longer trading)
TBills - TB - CME (no longer trading)
Gold - GC - COMEX
Copper High Grade - HG - COMEX
Silver - SV - COMEX
NYSE Comp Futures - YX - NYFE
Light Sweet Crude Oil - CL - NYMEX
NY Harbor USLD Futures (Heating Oil) - HO - NYMEX
Natural Gas - NG - NYMEX
Palladium - PA - NYMEX
Platinum - PL - NYMEX
Natural Gas E-mini - QG - NYMEX
Crude Oil E-Mini - QM - NYMEX
RBOB Gasoline Futures - XB - NYMEX
Unleaded Gasoline - HU - NYMEX (no longer trading)
MSCI Emerging Markets Mini - EI - ICE
MSCI EAFE Mini - MG - ICE
Gold mini-sized - XG - ICE
Silver mini-sized - YS - ICE
Sugar #11 - SB - ICE
Russell 1000 Mini - RM - ICE
Russell 2000 Mini - ER - ICE
Eurodollar - EL - ICE (no longer trading)
Two Data Column CSV Example
Below is a very simple example of a .csv file that can be uploaded to Quantopian. The first row of the file must be a header, and columns are separated by the comma character (,).

date,symbol,signal1,signal2
2014-01-01,TIF,1204.5,0
2014-01-02,TIF,1225,0.5
2014-01-03,TIF,1234.5,0
2014-01-06,TIF,1246.3,0.5
2014-01-07,TIF,1227.5,0
Data Columns
Data should be in CSV format (comma-separated values) with a minimum of three columns: one primary date, one primary asset, and one or more value columns. Note: Column names must be unique and contain only alpha-numeric characters and underscores.

Primary Date: A column must be selected as the primary date - a date column to which the record's values apply; generally it's the day on which you learned about the information. You may also know this as the primary date or the asof_date for other items in the Quantopian universe. Data before 2002 will be ignored. Blank or NaN/NaT values will cause errors. Example date formats: 2018-01-01, 20180101, 01/01/2018, 1/1/2018. Note: datetime asof_date values are not supported in our existing pipeline extraction; please contact us at [email protected] to discuss your use case.

Primary Asset: A column must be selected as the primary asset. This symbol column is used to identify assets on that date. If you have multiple share classes of an asset, those should be identified on the symbol with a delimiter (-, /, ., _), e.g. "BRK_A" or "BRK/A". Note: Blank values will cause errors. Records that cannot be mapped from symbols to assets in the Quantopian US equities database will be skipped.

Value(s) (1+): One or more value columns provided by the user for use in their particular strategy. The example .csv file earlier in this document has 2 value columns. During the historical upload process, you will need to map your data columns into one of the acceptable data formats below:

Numeric - Numeric values. Note that these will be converted to floats to maximize precision in future calculations.
String - Textual values which do not need further conversion. Often titles or categories.
Date - Will not be used as the primary date column, but can still be examined within the algorithm or notebook. Date type columns are not adjusted during timezone adjustment. Example date formats: 2018-01-01, 20180101, 01/01/2018, 1/1/2018.
Datetime - Date with a timestamp (in UTC); values will be adjusted when the incoming timezone is configured to a value other than UTC. A general datetime field that can be examined within the algorithm or notebook. Example datetime formats: [Sample Date] 1:23 or [Sample Date] 20:23-05:00 (with timezone designator).
Bool - True or False values (0, 'f', 'F', 'false', 'False', 'FALSE' and 1, 't', 'T', 'true', 'True', 'TRUE').

Reserved column names: timestamp and sid will be added by Quantopian during data processing; you cannot use these column names in your source data. If you have a source column named symbol, it must be set as the Primary Asset. NaN values: By default the following values will be interpreted as NaN: '', '#N/A', '#NA', 'N/A', '-NaN', 'NULL', 'NaN', 'null'. Infinite values: By default 'inf' or '-inf' values will be interpreted as positive and negative infinite values.
Name Your Dataset
Dataset Name: Used as the namespace for importing your dataset. The name must start with a letter and must contain only lowercase alpha-numeric and underscore characters.

Upload and Map Historical Data
Click or drag a file to upload into the historical data drag target. Clicking on the button will bring up your computer's file browser. Once a file has been successfully loaded in the preview pane, you must define one Primary Date and one Primary Asset column and map all of the data columns to specific types.

Configure Historical Lag
Historical timestamps are generated by adding the historical lag to the primary date. The default historical lag is one day, which prevents end-of-day date type values from appearing in pipeline a day early. ex: 2018-03-02 with a one hour lag would create a historical timestamp of 2018-03-02 01:00:00, which would incorrectly appear in pipeline on 2018-03-02. The default one day historical lag adjusts it to appear correctly in pipeline on 2018-03-03.

Configure Timezone
By default, datetime values are expected in UTC or should include a timezone designator. If another timezone is selected (ex: US/Eastern), incoming datetime type columns will be converted to UTC when a timezone designator is not provided. Note: date type columns will not be timezone adjusted. Be careful when determining the timezone of a new datasource with datetime values that don't have the timezone designator. For example, GOOGLEFINANCE in Google Sheets returns dates like YYYY-MM-DD 16:00:00, which are end-of-day values for the market in East Coast time. Selecting US/Eastern will properly convert the datetime to YYYY-MM-DD 20:00:00 during daylight savings time and YYYY-MM-DD 21:00:00 otherwise.

Selecting Next will complete the upload and start the data validation process.
Preparing your historical data:
During the historical upload, Quantopian will translate the primary date column values into a historical timestamp by adding a historical lag (default: one day). In the example above, the date column was selected as the primary date and adjusted to create the historical timestamps. Corner cases to think about: Avoid lookahead data: The primary date column should not include future-dated historical values; we will ignore them. A similar requirement exists for live data: the asof_date must not be greater than the timestamp. Note: these will cause issues with pipeline. Trade Date Signals: If your date field represents the day you expect an algorithm to act on the signal, you should create a trade_date_minus_one column that can be used as the primary date column (see the sketch below).
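For example, a trade_date_minus_one column could be added to a CSV before upload with a short pandas script. The file and column names here are hypothetical, and the one-business-day shift is an approximation that ignores exchange holidays:

import pandas as pd

# Hypothetical local file with a 'trade_date' column.
df = pd.read_csv('my_signals.csv')

# Shift the intended trade date back one business day so that, after
# Quantopian adds the default one-day historical lag, each signal
# surfaces in pipeline on the day it should be acted on.
df['trade_date_minus_one'] = (
    pd.to_datetime(df['trade_date']) - pd.tseries.offsets.BDay(1)
)

df.to_csv('my_signals_upload.csv', index=False)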
Risk Model (Experimental)
Functions and classes listed here provide access to the outputs of the Quantopian Risk Model via the Pipeline API. They are currently importable from quantopian.pipeline.experimental. We expect to eventually stabilize and move these features to quantopian.pipeline. risk_loading_pipeline() Create a pipeline with all risk loadings for the Quantopian Risk Model. Returns: pipeline (quantopian.pipeline.Pipeline) - A Pipeline containing risk loadings for each factor in the Quantopian Risk Model.
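For example, in research the loadings can be computed for a date range with run_pipeline (the dates below are arbitrary):

from quantopian.pipeline.experimental import risk_loading_pipeline
from quantopian.research import run_pipeline

# DataFrame indexed by (date, asset) with one column per risk factor.
loadings = run_pipeline(risk_loading_pipeline(), '2017-01-03', '2017-01-10')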
Module Import
Only specific, whitelisted portions of Python modules can be imported. Select portions of the following libraries are allowed. If you need a module that isn't on this list, please let us know. bisect blaze cmath collections copy cvxopt cvxpy datetime functools heapq itertools math numpy odo operator pandas pykalman pytz quantopian Queue re scipy sklearn sqlalchemy statsmodels talib time zipline zlib
Continuous Futures
Futures contracts have short lifespans. They trade for a specific period of time before they expire at a pre-determined date of delivery. In order to hold a position over many days in an underlying asset, you might need to close your position in an expiring contract and open a position in the next contract. Similarly, when looking back at historical data, you need to get pricing or volume data from multiple contracts to get a series of data that reflects the underlying asset. Both of these problems can be solved by using continuous futures. Continuous futures can be created with the continuous_future function. Using the continuous_future function brings up a search box for you to look up the root symbol of a particular future. Continuous futures are abstractions over the 'underlying' commodities/assets/indexes of futures. For example, if we wanted to trade crude oil, we could create a reference to CL, instead of a list of CLF16, CLG16, CLH16, CLJ16, etc. Instead of trying to figure out which contract in the list we want to trade on a particular date, we can use the continuous future to get the current active contract. We can do this by using data.current() as shown in this example:

def initialize(context):
    # S&P 500 E-Mini Continuous Future.
    context.future = continuous_future('ES')
    schedule_function(daily_func, date_rules.every_day(), time_rules.market_open())

def daily_func(context, data):
    es_active = data.current(context.future, 'contract')

It is important to understand that continuous futures are not tradable assets. They simply maintain a reference to the active contracts of an underlying asset, so we need to get the underlying contract before placing an order. Continuous futures allow you to maintain a continuous reference to an underlying asset. This is done by stitching together consecutive contracts. Plotting the volume history of the crude oil continuous future alongside the volume of the individual crude oil contracts shows that the volume of the continuous future is the "skyline" of the contracts that make it up.
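Extending the example above, the continuous future can also be passed to data.history to get a price series stitched across contracts, while orders are placed against the active contract. A minimal sketch (the 30-day window and single-contract order are arbitrary):

def daily_func(context, data):
    # Current active contract for the continuous future.
    es_active = data.current(context.future, 'contract')
    # 30 daily bars of prices stitched across consecutive contracts.
    prices = data.history(context.future, 'price', 30, '1d')
    # Orders must target the actual contract, not the continuous future.
    if data.can_trade(es_active):
        order_target(es_active, 1)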
Research Notebooks
If you are loading a new dataset into a notebook, you will need to Restart the notebook to be able to access the new import. The dataset can be imported before all the data is fully available. You can monitor the status by checking the load_metrics data in research. This will also help you identify any rows that were skipped during the upload (trim before 2002-01-01) and symbol mapping process (rows_received - total_rows).

import pandas as pd
from odo import odo
from quantopian.interactive.data.user_<UserId> import load_metrics

lm = odo(load_metrics[['timestamp', 'dataset', 'status', 'rows_received', 'total_rows',
                       'rows_added', 'delta_rows_added', 'last_updated', 'time_elapsed',
                       'filenames_downloaded', 'source_last_updated', 'bytes_downloaded',
                       'db_table_size', 'error']].sort('timestamp', ascending=False),
         pd.DataFrame)
lm
Importing Files from Google Drive
If you don't have a public Dropbox folder, you can also import a CSV from your Google Drive. To get the public URL for your file: Click on File > Publish to the web. Change the 'web page' option to 'comma-separated values (.csv)'. Click the Publish button. Copy and paste the URL into your Fetcher query.
Data Manipulation with Fetcher
If you produce the CSV, it is relatively easy to put the data into a good format for Fetcher. First decide if your file should be a signal or security info source, then build your columns accordingly. However, you may not always have control over the CSV data file. It may be maintained by someone else, or you may be using a service that dynamically generates the CSV. Quandl, for example, provides a REST API to access many curated datasets as CSV. While you could download the CSV files and modify them before using them in Fetcher, you would lose the benefit of the nightly data updates. In most cases it's better to request fresh files directly from the source. Fetcher provides two ways to alter the CSV file: pre_func specifies the method you want to run on the pandas dataframe containing the CSV immediately after it was fetched from the remote server. Your method can rename columns, reformat dates, slice or select data - it just has to return a dataframe. post_func is called after Fetcher has sorted the data based on your given date column. This method is intended for time series calculations you want to do on the entire dataset, such as timeshifting, calculating rolling statistics, or adding derived columns to your dataframe. Again, your method should take a dataframe and return a dataframe.
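A minimal sketch of both hooks (the URL, the 'Settle' column name, and the rolling window are hypothetical):

def rename_col(df):
    # pre_func: runs on the raw frame immediately after the CSV is
    # fetched; rename a vendor column to something easier to reference.
    return df.rename(columns={'Settle': 'price'})

def add_rolling_mean(df):
    # post_func: runs after Fetcher sorts the frame by date; suitable
    # for whole-series calculations such as rolling statistics.
    df['price_ma5'] = df['price'].rolling(5).mean()
    return df

def initialize(context):
    fetch_csv('https://example.com/settlements.csv',  # hypothetical URL
              date_column='Date',
              symbol='settle',
              pre_func=rename_col,
              post_func=add_rolling_mean)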
Fixed Basis Points Slippage
In the FixedBasisPointsSlippage model, the fill price of an order is a function of the order price and a fixed percentage (in basis points). The basis_points constant (default 5) defines how large of an impact your order will have on the backtester's price calculation. The slippage is calculated by converting the basis_points constant to a percentage (5 basis points = 0.05%) and multiplying that percentage by the order price. Buys will fill at a price that is 0.05% higher than the close of the next minute, while sells will fill at a price that is 0.05% lower. A buy order for a stock currently selling at $100 per share would fill at $100.05 (100 + (0.0005 * 100)), while a sell order would fill at $99.95 (100 - (0.0005 * 100)). The volume_limit cap (default 0.10) limits the proportion of volume that your order can take up per bar. For example: suppose you want to place an order, 1000 shares trade in each of the next several minutes, and the volume_limit is 0.10. If you place an order for 220 shares, then your trade order will be split into three transactions (100 shares, 100 shares, and 20 shares). Setting the volume_limit to 1.00 will permit the backtester to use up to 100% of the bar towards filling your order. Using the same example, this will fill 220 shares in the next minute bar.
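This model is set in initialize; a minimal sketch with the default parameters spelled out:

def initialize(context):
    # 5 basis points of price impact, with fills capped at 10% of each
    # bar's traded volume.
    set_slippage(slippage.FixedBasisPointsSlippage(basis_points=5,
                                                   volume_limit=0.1))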
Using the Debugger
In the IDE, click on a line number in the gutter to set a breakpoint. A breakpoint can be set on any line except comments and method definitions. A blue marker appears once the breakpoint is set. To set a conditional breakpoint, right-click on a breakpoint's blue marker and click 'Edit Breakpoint'. Put in some Python code and the breakpoint will hit when this condition evaluates to true. For example, a conditional breakpoint might hit when the price of Apple is above $100. Conditional breakpoints are shown in yellow in the left-hand gutter. You can set an unlimited number of breakpoints. Once the backtest has started, it will stop when execution gets to a line that has a breakpoint, at which point the backtest is paused and the debug window is shown. In the debugger, you can then query your variables, orders and portfolio state, data, and anything else used in your backtest. While the debugger window is active, you can set and remove other breakpoints. To inspect an object, enter it in the debug window and press enter. Most objects will be pretty-printed into a tree format to enable easy inspection and exploration. The following commands can be used in the debugger window:
Futures
There are a number of API functions specific to futures that make it easy to get data or trade futures contracts in your algorithms. A walkthrough of these tools is available in the Futures Tutorial.
Order Methods
There are many different ways to place orders from an algorithm. For most use-cases, we recommend using order_optimal_portfolio, which correctly accounts for existing open orders and allows for an easy transition to sophisticated portfolio optimization techniques. Algorithms that require explicit manual control of their orders can use the lower-level ordering functions, but should note that the family of order_target functions considers only the status of filled orders, not open orders, when calculating the number of shares or contracts needed to reach the target position. This tutorial lesson demonstrates how you can prevent over-ordering; a short sketch also appears at the end of this section. order_optimal_portfolio(objective, constraints) Place one or more orders by calculating a new optimal portfolio based on the objective defined by objective and the constraints defined by constraints. See the Optimize API documentation for more details. order(asset, amount, style=OrderType) Places an order for the specified asset and the specified amount of shares (equities) or contracts (futures). The order type is inferred from the parameters used. If only asset and amount are used as parameters, the order is placed as a market order. Parameters asset: An Equity object or a Future object. amount: The integer amount of shares or contracts. Positive means buy, negative means sell. OrderType: (optional) Specifies the order style and the default is a market order. The available order styles are: style=MarketOrder(exchange) style=StopOrder(stop_price, exchange) style=LimitOrder(limit_price, exchange) style=StopLimitOrder(limit_price=price1, stop_price=price2, exchange) Click here to view an example for exchange routing. Returns An order id. order_value(asset, amount, style=OrderType) Place an order by desired value rather than desired number of shares. Placing a negative order value will result in selling the given value. Orders are always truncated to whole shares or contracts. Example Order AAPL worth up to $1000: order_value(symbol('AAPL'), 1000). If the price of AAPL is $105 a share, this would buy 9 shares, since the partial share would be truncated (discarding slippage and transaction cost). The value of a position in a futures contract is computed to be the unit price times the number of units per contract (otherwise known as the size of the contract). Parameters asset: An Equity object or a Future object. amount: Floating point dollar value of shares or contracts. Positive means buy, negative means sell. OrderType: (optional) Specifies the order style and the default is a market order. The available order styles are: style=MarketOrder(exchange) style=StopOrder(stop_price, exchange) style=LimitOrder(limit_price, exchange) style=StopLimitOrder(limit_price=price1, stop_price=price2, exchange) Click here to view an example for exchange routing. Returns An order id. order_percent(asset, amount, style=OrderType) Places an order in the specified asset corresponding to the given percent of the current portfolio value, which is the sum of the positions value and ending cash balance. Placing a negative percent order will result in selling the given percent of the current portfolio value. Orders are always truncated to whole shares or contracts. Percent must be expressed as a decimal (0.50 means 50%). The value of a position in a futures contract is computed to be the unit price times the number of units per contract (otherwise known as the size of the contract). Example order_percent(symbol('AAPL'), .5) will order AAPL shares worth 50% of current portfolio value.
If AAPL is $100/share and the portfolio value is $2000, this buys 10 shares (discarding slippage and transaction cost). Parameters asset: An Equity object or a Future object. amount: The floating point percentage of portfolio value to order. Positive means buy, negative means sell. OrderType: (optional) Specifies the order style and the default is a market order. The available order styles are: style=MarketOrder(exchange) style=StopOrder(stop_price, exchange) style=LimitOrder(limit_price, exchange) style=StopLimitOrder(limit_price=price1, stop_price=price2, exchange) Click here to view an example for exchange routing. Returns An order id. order_target(asset, amount, style=OrderType) Places an order to adjust a position to a target number of shares. If there is no existing position in the asset, an order is placed for the full target number. If there is a position in the asset, an order is placed for the difference between the target number of shares or contracts and the number currently held. Placing a negative target order will result in a short position equal to the negative number specified. Example If the current portfolio has 5 shares of AAPL and the target is 20 shares, order_target(symbol('AAPL'), 20) orders 15 more shares of AAPL. Parameters asset: An Equity object or a Future object. amount: The integer amount of target shares or contracts. Positive means buy, negative means sell. OrderType: (optional) Specifies the order style and the default is a market order. The available order styles are: style=MarketOrder(exchange) style=StopOrder(stop_price, exchange) style=LimitOrder(limit_price, exchange) style=StopLimitOrder(limit_price=price1, stop_price=price2, exchange) Click here to view an example for exchange routing. Returns An order id, or None if there is no difference between the target position and current position. order_target_value(asset, amount, style=OrderType) Places an order to adjust a position to a target value. If there is no existing position in the asset, an order is placed for the full target value. If there is a position in the asset, an order is placed for the difference between the target value and the current position value. Placing a negative target order will result in a short position equal to the negative target value. Orders are always truncated to whole shares or contracts. The value of a position in a futures contract is computed to be the unit price times the number of units per contract (otherwise known as the size of the contract). Example If the current portfolio holds $500 worth of AAPL and the target is $2000, order_target_value(symbol('AAPL'), 2000) orders $1500 worth of AAPL (rounded down to the nearest share). Parameters asset: An Equity object or a Future object. amount: Floating point dollar value of shares or contracts. Positive means buy, negative means sell. OrderType: (optional) Specifies the order style and the default is a market order. The available order styles are: style=MarketOrder(exchange) style=StopOrder(stop_price, exchange) style=LimitOrder(limit_price, exchange) style=StopLimitOrder(limit_price=price1, stop_price=price2, exchange) Click here to view an example for exchange routing. Returns An order id, or None if there is no difference between the target position and current position. order_target_percent(asset, percent, style=type) Place an order to adjust a position to a target percent of the current portfolio value. If there is no existing position in the asset, an order is placed for the full target percentage. 
If there is a position in the asset, an order is placed for the difference between the target percent and the current percent. Placing a negative target percent order will result in a short position equal to the negative target percent. Portfolio value is calculated as the sum of the positions value and ending cash balance. Orders are always truncated to whole shares, and percentage must be expressed as a decimal (0.50 means 50%). The value of a position in a futures contract is computed to be the unit price times the number of units per contract (otherwise known as the size of the contract). Example If the current portfolio value is 5% worth of AAPL and the target is to allocate 10% of the portfolio value to AAPL, order_target_percent(symbol('AAPL'), 0.1) will place an order for the difference, in this case ordering 5% portfolio value worth of AAPL. Parameters asset: An Equity object or a Future object. percent: The portfolio percentage allocated to the asset. Positive means buy, negative means sell. type: (optional) Specifies the order style and the default is a market order. The available order styles are: style=MarketOrder(exchange) style=StopOrder(stop_price, exchange) style=LimitOrder(limit_price, exchange) style=StopLimitOrder(limit_price=price1, stop_price=price2, exchange) Click here to view an example for exchange routing. Returns An order id, or None if there is no difference between the target position and current position. cancel_order(order) Attempts to cancel the specified order. Cancel is attempted asynchronously. Parameters order: Can be the order_id as a string or the order object. Returns None get_open_orders(sid) If asset is None or not specified, returns all open orders. If asset is specified, returns open orders for that asset Parameters sid: An Equity object or a Future object. Can also be None. Returns If asset is unspecified or None, returns a dictionary keyed by asset ID. The dictionary contains a list of orders for each ID, oldest first. If an asset is specified, returns a list of open orders for that asset, oldest first. get_order(order) Returns the specified order. The order object is discarded at the end of handle_data. Parameters order: Can be the order_id as a string or the order object. Returns An order object that is read/writeable but is discarded at the end of handle_data.
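As noted at the start of this section, the order_target family looks only at filled orders, so a common guard is to skip assets that still have open orders. A minimal sketch (context.target_assets and the 5% weight are hypothetical):

def rebalance(context, data):
    for asset in context.target_assets:  # hypothetical list of assets
        # order_target_percent only accounts for filled orders, so
        # skip assets with open orders to avoid ordering twice.
        if not get_open_orders(asset) and data.can_trade(asset):
            order_target_percent(asset, 0.05)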
Trading Guards
There are several trading guards you can place in your algorithm to prevent unexpected behavior. All the guards are enforced when orders are placed. These guards are set in the initialize function. Set Restrictions On Specific Assets You can prevent the algorithm from trading specific assets by using set_asset_restrictions. If the algorithm attempts to order any asset that is restricted, it will stop trading and throw an exception. To avoid attempts to order any restricted assets, use the can_trade method, which will return False for assets that are restricted at that point in time. We've built a point-in-time asset restriction for you that includes all the leveraged ETFs. To use this asset restriction, use set_asset_restrictions(security_lists.restrict_leveraged_etfs). To get a list of leveraged ETFs at a given time, call security_lists.leveraged_etf_list.current_securities(dt), where dt is the current datetime, get_datetime(). For a trader trying to track their own leverage levels, these ETFs are a challenge. The contest prohibits trading in these ETFs for that reason.

def initialize(context):
    # Restrict leveraged ETFs.
    set_asset_restrictions(security_lists.restrict_leveraged_etfs)

def handle_data(context, data):
    # can_trade takes into account whether or not the asset is
    # restricted at a specific point in time. BZQ is a leveraged ETF,
    # so can_trade will always return False.
    if data.can_trade(symbol('BZQ')):
        order_target(symbol('BZQ'), 100)
    else:
        print 'Cannot trade %s' % symbol('BZQ')

Additionally, custom restrictions can be made to test how an algorithm would behave if particular assets were restricted. A custom StaticRestrictions object can be created from a list of assets that will be restricted for the entire simulation.

from zipline.finance.asset_restrictions import StaticRestrictions

def initialize(context):
    # Restrict AAPL and MSFT for the duration of the simulation.
    context.restricted_assets = [symbol('AAPL'), symbol('MSFT')]
    set_asset_restrictions(StaticRestrictions(context.restricted_assets))

def handle_data(context, data):
    for asset in context.restricted_assets:
        # can_trade always returns False for both assets.
        if data.can_trade(asset):
            order_target(asset, 100)
        else:
            print 'Cannot trade %s' % asset

A custom HistoricalRestrictions object can be created from a list of Restriction objects, each defined by an asset, effective_date and state.
These restrictions define, for each restricted asset, the specific time periods for which those restrictions are effective.

import pandas as pd
from zipline.finance.asset_restrictions import (
    HistoricalRestrictions,
    Restriction,
    RESTRICTION_STATES as states,
)

def initialize(context):
    # Restrict AAPL from 2011-01-06 10:00 to 2011-01-07 10:59, inclusive.
    # Restrict MSFT from 2011-01-06 to 2011-01-06 23:59, inclusive.
    historical_restrictions = HistoricalRestrictions([
        Restriction(symbol('AAPL'),
                    pd.Timestamp('2011-01-06 10:00', tz='US/Eastern'),
                    states.FROZEN),
        Restriction(symbol('MSFT'),
                    pd.Timestamp('2011-01-06', tz='US/Eastern'),
                    states.FROZEN),
        Restriction(symbol('AAPL'),
                    pd.Timestamp('2011-01-07 11:00', tz='US/Eastern'),
                    states.ALLOWED),
        Restriction(symbol('MSFT'),
                    pd.Timestamp('2011-01-07', tz='US/Eastern'),
                    states.ALLOWED),
    ])
    set_asset_restrictions(historical_restrictions)

def handle_data(context, data):
    # Returns True before 2011-01-06 10:00 and after 2011-01-07 10:59.
    if data.can_trade(symbol('AAPL')):
        order_target(symbol('AAPL'), 100)
    else:
        print 'Cannot trade %s' % symbol('AAPL')
    # Returns True before 2011-01-06 and after 2011-01-06 23:59.
    if data.can_trade(symbol('MSFT')):
        order_target(symbol('MSFT'), 100)
    else:
        print 'Cannot trade %s' % symbol('MSFT')

Long Only Use set_long_only to prevent the algorithm from taking short positions. It does not apply to existing open orders or positions in your portfolio.

def initialize(context):
    # Algorithm will raise an exception if it attempts to place an
    # order which would cause us to hold negative shares of any security.
    set_long_only()

Maximum Order Count Sets a limit on the number of orders that can be placed by this algorithm in a single day. set_max_order_count must be called in the initialize function; it will have no effect if placed elsewhere in the algorithm.

def initialize(context):
    # Algorithm will raise an exception if more than 50 orders are
    # placed in a day.
    set_max_order_count(50)

Maximum Order Size Sets a limit on the size of any single order placed by this algorithm. This limit can be set in terms of number of shares, dollar value, or both. The limit can optionally be set for a given security; if the security is not specified, it applies to all securities. This must be run in the initialize function.

def initialize(context):
    # Algorithm will raise an exception if we attempt to order more than
    # 10 shares or 1000 dollars worth of AAPL in a single order.
    set_max_order_size(symbol('AAPL'), max_shares=10, max_notional=1000.0)

Maximum Position Size Sets a limit on the absolute magnitude of any position held by the algorithm for a given security. This limit can be set in terms of number of shares, dollar value, or both. A position can grow beyond this limit because of market movement; the limit is only imposed at the time the order is placed. The limit can optionally be set for a given security; if the security is not specified, it applies to all securities. This must be run in the initialize function.

def initialize(context):
    # Algorithm will raise an exception if we attempt to hold more than
    # 30 shares or 2000 dollars worth of AAPL.
    set_max_position_size(symbol('AAPL'), max_shares=30, max_notional=2000.0)
Volume Share Slippage
In the VolumeShareSlippage model, the price you get is a function of your order size relative to the security's actual traded volume. You provide a volume_limit cap (default 0.025), which limits the proportion of volume that your order can take up per bar. For example: if the backtest is running in one-minute bars, you place an order for 60 shares, 1000 shares trade in each of the next several minutes, and the volume_limit is 0.025, then your trade order will be split into three orders (25 shares, 25 shares, and 10 shares). Setting the volume_limit to 1.00 will permit the backtester to use up to 100% of the bar towards filling your order. Using the same example, this will fill 60 shares in the next minute bar. The price impact constant (default 0.1) defines how large of an impact your order will have on the backtester's price calculation. The slippage is calculated by multiplying the price impact constant by the square of the ratio of the order to the total volume. In our previous example, for the 25-share orders, the price impact is .1 * (25/1000) * (25/1000), or 0.00625%. For the 10-share order, the price impact is .1 * (10/1000) * (10/1000), or .001%.
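Like the fixed-basis-points model, this model is set in initialize; a minimal sketch with the default parameters spelled out:

def initialize(context):
    # Cap fills at 2.5% of each bar's volume, with quadratic price
    # impact scaled by a constant of 0.1.
    set_slippage(slippage.VolumeShareSlippage(volume_limit=0.025,
                                              price_impact=0.1))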
Dataset Limits
Initial limit of 30 datasets. (Note: currently datasets cannot be modified after submission. You can still load static historical data to research with local_csv, or to an algorithm with fetch_csv, but the data cannot be added to Pipeline in either scenario.)
Maximum of 20 columns.
Maximum file size of 300MB.
Maximum dataset name of 56 characters.
Live uploads will be processed on trading days between 07:00 and 10:00 UTC.
Dataset Upload Format
Initially, this feature supports custom datasets that have a single row of data per asset per day, in CSV file format. This format fits naturally into the expected pipeline format.
Figuring out what assets came from Fetcher
It can be useful to know which assets currently have Fetcher signals for a given day. data.fetcher_assets returns a list of assets that, for the given backtest or live trading day, are active from your Fetcher file. Here's an example file with ranked securities:

symbol, start date, stock_score
AA, 2/13/12, 11.7
WFM, 2/13/12, 15.8
FDX, 2/14/12, 12.1
M, 2/16/12, 14.3

You can backtest an algorithm against this sample file during the dates 2/13/2012 - 2/18/2012 (a sketch appears below). When you use this sample file, data.fetcher_assets will return, for each day:

2/13/2012: AA, WFM
2/14/2012: FDX
2/15/2012: FDX (forward filled because no new data became available)
2/16/2012: M

Note that when using this feature in live trading, it is important that all historical data in your fetched data be accessible and unchanged. We do not keep a copy of fetched data; it is reloaded at the start of every trading day. If historical data in the fetched file is altered or removed, the algorithm will not run properly. New data should always be appended to (never overwritten in) your existing Fetcher .csv source file. In addition, appended rows cannot contain dates in the past. For example, on May 3rd, 2015, a row with the date May 2nd, 2015, could not be appended in live trading.
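A minimal sketch of such an algorithm (the URL is hypothetical, and the equal-weight sizing is arbitrary):

def initialize(context):
    fetch_csv('https://example.com/ranked_stocks.csv',  # hypothetical URL
              date_column='start date',
              date_format='%m/%d/%y')

def handle_data(context, data):
    assets = data.fetcher_assets
    if not assets:
        return
    # Equal-weight whatever symbols are active in the Fetcher file today.
    for stock in assets:
        order_target_percent(stock, 1.0 / len(assets))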
Importing Files from Dropbox
Many users find Dropbox to be a convenient way to access CSV files. To use Dropbox, place your file in the Public folder and use the 'Public URL'. A common mistake is to use a URL of format https://www.dropbox.com/s/abcdefg/filename.csv, which is a URL about the file, not the file itself. Instead, you should use the Public URL which has a format similar to https://dl.dropboxusercontent.com/u/1234567/filename.csv.
Fetcher - Load any CSV file
Note: The best way to upload a time-series CSV, with accurate point-in-time data via Pipeline, is the Self-Serve Data feature. Fetcher should not be used in algorithms attempting to enter the contest or receive an investment allocation. Quantopian provides historical data since 2002 for US equities in minute bars. The US market data provides a backbone for financial analysis, but some of the most promising areas of research are finding signals in non-market data. Fetcher provides your algorithm with access to external time series data. Any time series that can be retrieved as a CSV file via http or https can be incorporated into a Quantopian algorithm. Fetcher lets Quantopian download CSV files and use them in your simulations. To use it, call fetch_csv(url) in your initialize method. fetch_csv will download the CSV file and parse it into a pandas dataframe. You may then specify your own methods to modify the entire dataframe prior to the start of the simulation. During simulation, the rows of the CSV/dataframe are provided to your algorithm's handle_data and other functions as additional properties of the data parameter. Best of all, your Fetcher data will play nicely with Quantopian's other data features: Use record to plot a time series of your fetcher data. Your data will be streamed to your algorithm without look-ahead bias. That means if your backtest is currently at 10/01/2013 but your Fetcher data begins on 10/02/2013, your algorithm will not have access to the Fetcher data until 10/02/2013. You can account for this by checking the existence of your Fetcher field; see common errors for more information. Fetcher supports two kinds of time series: Security Information: data that is about individual securities, such as short interest for a stock. Signals: data that stands alone, such as the Consumer Price Index, or the spot price of palladium. For Security Info, your CSV file must have a column with a header of 'symbol', which represents the symbol of that security on the date of that row. Internally, Fetcher maps the symbol to the Quantopian security id (sid). You can have many securities in a single CSV file. To access your CSV data in handle_data:

## This algo imports sample short interest data from a CSV file for
## one security, NFLX, and plots the short interest:

def initialize(context):
    # Fetch data from a CSV file somewhere on the web.
    # Note that one of the columns must be named 'symbol' for
    # the data to be matched to the stock symbol.
    fetch_csv('https://dl.dropboxusercontent.com/u/169032081/fetcher_sample_file.csv',
              date_column='Settlement Date',
              date_format='%m/%d/%y')
    context.stock = symbol('NFLX')

def handle_data(context, data):
    record(Short_Interest=data.current(context.stock, 'Days To Cover'))

Here is the sample CSV file. Note that for the Security Info type of import, one of the columns must be 'symbol'.

Settlement Date,symbol,Days To Cover
9/30/13,NFLX,2.64484
9/13/13,NFLX,2.550829
8/30/13,NFLX,2.502331
8/15/13,NFLX,2.811858
7/31/13,NFLX,1.690317

For Signals, your CSV file does not need a symbol column. Instead, you provide it via the symbol parameter:

def initialize(context):
    fetch_csv('https://yourserver.com/cpi.csv', symbol='cpi')

def handle_data(context, data):
    # Get the CPI for this date.
    current_cpi = data.current('cpi', 'value')
    # Plot it.
    record(cpi=current_cpi)
Backtest results
Once a full backtest starts, we load all the trading events for the securities your algorithm specified and feed them to your algorithm in time order. Results start streaming in shortly after the backtest starts. The backtest results page is divided into several sections:

Backtest settings and status: Shows the initial settings for the backtest, the progress bar while the backtest is in progress, and the final state once the test is done. If the backtest is cancelled, exceeds its max daily loss, or has runtime errors, that information is displayed here.

Result details: Here's where you dive into the details of your backtest results. You can examine every transaction that occurred during the backtest, see how your positions evolved over time, and look at detailed risk metrics. For the risk metrics, we show 1, 3, 6, and 12-month windows to provide a more granular breakdown.

Overall results: The overall performance and risk measures of your backtest. These numbers update during the course of the backtest as new data comes in.

Cumulative performance and benchmark overlay: Shows your algorithm's performance over time (in blue) overlaid with the benchmark (in red).

Daily and weekly P/L: Shows your P/L per day or week, depending on the date range selected.

Transactions chart: Shows the cumulative dollar value of all the buys and sells your algorithm placed, per day or week. Buys are shown in positive blue, and sells in negative red.
Set Up Live Data
Once your historical data has successfully validated, you'll navigate to the Set Up Live Data tab. You can configure your Connection Type settings to download a new file daily. In addition to the standard FTP option, live data can be downloaded from hosted CSV files such as Google Sheets, Dropbox, or any API service that supports token-based URLs (as opposed to authenticated access). Each trading day, between 07:00 and 10:00 UTC, the live file is downloaded and compared against existing dataset records. Brand new records are added to the base tables, and updates are added to the deltas table according to the Partner Data process logic. Note: Column names must be identical between the historical and live files. The live data download will use the same column types as configured during the historical upload. Clicking submit will add your dataset to the historical dataset ingestion queue. You can navigate to your specific dataset page by clicking on the dataset name in the Self-Serve Data section. This page is only viewable by you and includes the First Load Date and code samples for pipeline use.
Validation
Our IDE has extensive syntax and validation checks. It makes sure your algorithm is valid Python, fulfills our API, and has no obvious runtime exceptions (such as dividing by zero). You can run the validation checks by clicking the Build button (or pressing Control-B), and we run them automatically right before starting a new backtest. Errors and warnings are shown in the window on the right side of the IDE; a typical example is a log line missing an end quote. When all errors and warnings are resolved, the Build button kicks off a quick backtest. The quick backtest is a way to make sure that the algorithm roughly does what you want it to, without any errors. Once the algorithm is running roughly the way you'd like, click the 'Full Backtest' button to kick off a full backtest with minute-bar data.
Fundamental Data
Quantopian provides fundamental data from Morningstar, available in research and backtesting. The data covers over 8,000 companies traded in the US with over 900 metrics. Fundamental data can be accessed via the Pipeline API. A full listing of the available fundamental fields can be found on the Fundamentals Reference page. Fundamental data is updated on a daily basis on Quantopian. A sample algorithm is available showing how to access fundamental data.

"as of" Dates

Each fundamental data field has a corresponding as_of field; e.g., shares_outstanding also has shares_outstanding_as_of. The as_of field contains the relevant time period of the metric, as a Python date object. Some of the data in Morningstar is quarterly (revenue, earnings, etc.), while other data is daily or weekly (market cap, P/E ratio, etc.). Each metric's as_of date field is set to the end date of the period to which the metric applies. For example, if you use a quarterly metric like shares_outstanding, the accompanying date field shares_outstanding_as_of will be set to the end date of the relevant quarter (for example, June 30, 2014). The as_of date indicates the date upon which the measured period ends. A minimal pipeline sketch follows below.
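As a rough sketch of the access pattern, assuming the Fundamentals namespace exposes both the metric and its as_of companion as pipeline columns, per the description above (the exact column names are an assumption, not confirmed API):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import Fundamentals

def make_pipeline():
    # Latest known value of each column as of each simulation day.
    return Pipeline(columns={
        'shares_outstanding': Fundamentals.shares_outstanding.latest,
        # Assumed companion column; see the "as of" Dates discussion above.
        'shares_outstanding_as_of': Fundamentals.shares_outstanding_as_of.latest,
    })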
Slippage Models
Slippage is how our backtester models the realistic impact of your orders on the execution price you receive. When you place an order for a trade, your order affects the market. Your buy order drives prices up, and your sell order drives prices down; this is generally referred to as the 'price impact' of your trade. The size of the price impact is driven by how large your order is compared to the current trading volume. The slippage model also evaluates whether your order is simply too big: you can't trade more than the market's volume, and generally you can't expect to trade more than a fraction of the volume. All of these concepts are wrapped into the slippage model. When an order isn't filled because of insufficient volume, it remains open to be filled in the next minute. This continues until the order is filled, cancelled, or the end of the day is reached, when all open orders are cancelled.

Slippage must be defined in the initialize method. It has no effect if defined elsewhere in your algorithm. To set slippage, use the set_slippage method and pass in FixedBasisPointsSlippage, FixedSlippage, VolumeShareSlippage, or a custom slippage model that you define.

def initialize(context):
    set_slippage(slippage.FixedBasisPointsSlippage(basis_points=5, volume_limit=0.1))

If you do not specify a slippage method, slippage for US equities defaults to the FixedBasisPointsSlippage model with 5 basis points of fixed slippage. Futures follow a special volatility volume share model that uses 20-day annualized volatility as well as 20-day trading volume to model the price impact and fill rate of a trade. The model varies depending on which future is being traded and is explained in depth here. To set a custom slippage model for futures, pass a FixedSlippage model as a us_futures keyword argument to set_slippage().

# Setting custom equity and futures slippage models.
def initialize(context):
    set_slippage(
        us_equities=slippage.FixedBasisPointsSlippage(basis_points=5, volume_limit=0.1),
        us_futures=slippage.FixedSlippage(spread=0),
    )
Upcoming Contract Chain
Sometimes, knowing the forward-looking chain of contracts associated with a particular future can be helpful. For example, if you want to trade back contracts, you will need to reference contracts with delivery dates several months into the future. This can be done with the data.current_chain() function.

def initialize(context):
    # Crude Oil Continuous Future
    context.future = continuous_future('CL')
    schedule_function(daily_func, date_rules.every_day(), time_rules.market_open())

def daily_func(context, data):
    cl_chain = data.current_chain(context.future)
    front_contract = cl_chain[0]
    secondary_contract = cl_chain[1]
    tertiary_contract = cl_chain[2]
Sector Loadings
These classes provide access to sector loadings computed by the Quantopian Risk Model.

class BasicMaterials Quantopian Risk Model loadings for the basic materials sector.
class ConsumerCyclical Quantopian Risk Model loadings for the consumer cyclical sector.
class FinancialServices Quantopian Risk Model loadings for the financial services sector.
class RealEstate Quantopian Risk Model loadings for the real estate sector.
class ConsumerDefensive Quantopian Risk Model loadings for the consumer defensive sector.
class HealthCare Quantopian Risk Model loadings for the health care sector.
class Utilities Quantopian Risk Model loadings for the utilities sector.
class CommunicationServices Quantopian Risk Model loadings for the communication services sector.
class Energy Quantopian Risk Model loadings for the energy sector.
class Industrials Quantopian Risk Model loadings for the industrials sector.
class Technology Quantopian Risk Model loadings for the technology sector.

Style Loadings

These classes provide access to style loadings computed by the Quantopian Risk Model.

class Momentum Quantopian Risk Model loadings for the "momentum" style factor. This factor captures differences in returns between stocks that have had large gains in the last 11 months and stocks that have had large losses in the last 11 months.
class ShortTermReversal Quantopian Risk Model loadings for the "short term reversal" style factor. This factor captures differences in returns between stocks that have experienced short term losses and stocks that have experienced short term gains.
class Size Quantopian Risk Model loadings for the "size" style factor. This factor captures differences in returns between stocks with high market capitalizations and stocks with low market capitalizations.
class Value Quantopian Risk Model loadings for the "value" style factor. This factor captures differences in returns between "expensive" stocks and "inexpensive" stocks, measured by the ratio between each stock's book value and its market cap.
class Volatility Quantopian Risk Model loadings for the "volatility" style factor. This factor captures differences in returns between stocks that experience large price fluctuations and stocks that have relatively stable prices.
Entry Points
The Optimize API has three entry points:

quantopian.optimize.calculate_optimal_portfolio() calculates a portfolio that optimizes an objective subject to a list of constraints.
quantopian.algorithm.order_optimal_portfolio() calculates a new portfolio and then places the orders necessary to achieve that portfolio. order_optimal_portfolio() can only be called from a trading algorithm.
quantopian.optimize.run_optimization() performs the same optimization as calculate_optimal_portfolio() but returns an OptimizationResult with additional information.

order_optimal_portfolio(objective, constraints) Calculate an optimal portfolio and place orders toward that portfolio. Parameters: Raises: Returns:

calculate_optimal_portfolio(objective, constraints, current_portfolio=None) Calculate optimal portfolio weights given objective and constraints. Parameters: Returns: Raises: Notes This function is shorthand for calling run_optimization, checking for an error, and extracting the result's new_weights attribute. If an optimization problem is feasible, the following are equivalent:

# Using calculate_optimal_portfolio.
>>> weights = calculate_optimal_portfolio(objective, constraints, portfolio)

# Using run_optimization.
>>> result = run_optimization(objective, constraints, portfolio)
>>> result.raise_for_status()  # Raises if the optimization failed.
>>> weights = result.new_weights

See also quantopian.optimize.run_optimization()

run_optimization(objective, constraints, current_portfolio=None) Run a portfolio optimization. Parameters: Returns: See also quantopian.optimize.OptimizationResult, quantopian.optimize.calculate_optimal_portfolio()
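As a minimal sketch of the typical calling pattern from within an algorithm (the alpha Series stored on context is an illustrative assumption, computed elsewhere; MaxGrossExposure is used as a representative constraint):

import quantopian.algorithm as algo
import quantopian.optimize as opt

def rebalance(context, data):
    # context.alphas: a pandas Series mapping assets to expected-return
    # estimates, computed elsewhere in the algorithm (hypothetical).
    objective = opt.MaximizeAlpha(context.alphas)
    constraints = [opt.MaxGrossExposure(1.0)]
    # Calculate the optimal portfolio and place orders toward it.
    algo.order_optimal_portfolio(objective, constraints)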
Dividends
The Quantopian database holds over 150,000 dividend events dating from January 2002. Dividends are treated as events and streamed through the performance tracking system that monitors your algorithm during a backtest. Dividend events modify the security price and the portfolio's cash balance. In lookback windows, like history, prices are dividend-adjusted. Please review Data Sources for more detail. Dividends specify four dates:

declared date is the date on which the company announced the dividend.
record date is the date on which a shareholder must be recorded as an owner to receive a dividend payment. Because settlement can take 3 days, a second date is used to calculate ownership on the record date.
ex date is 3 trading days prior to the record date. If a holder sells the security before this date, they are not paid the dividend. The ex date is when the price of the security is typically most affected.
pay date is the date on which a shareholder receives the cash for a dividend.

Security prices are marked down by the dividend amount on the open following the ex_date. The portfolio's cash position is increased by the amount of the dividend on the pay date. Quantopian chose this method so that cash positions are correctly maintained, which is particularly important when an algorithm is used for live trading. The downside to this method is that it causes a lower portfolio value for the period between the two dates.

In order for your algorithm to receive dividend cash payments, you must have a long position (positive amount) in the security as of the close of market on the trading day prior to the ex_date, AND you must run the simulation through the pay date, which is typically about 60 calendar days later. If you are short the security at market close on the trading day prior to the ex_date, your algorithm will be required to pay the dividends due. As with long positions, the cash balance will be debited by the dividend payments on the pay date. This reflects the short seller's obligation to pay dividends to the entity that loaned the security.

Special dividends (where more than 25% of the value of the company is involved) are not yet tracked in Quantopian. There are several hundred of these over the last 11 years. We will add these dividends to our data in the future. Dividends are not relayed to algorithms as events that can be accessed by the API; we will add that feature in the future.
Debugger
The debugger gives you a powerful way to inspect the details of a running backtest. By setting breakpoints, you can pause execution and examine variables, order state, positions, and anything else your backtest is doing.
Technical Details
The debugger is available in the IDE. It is not available on the Full Backtest screen. You can edit your code during a debugging session, but those edits aren't used in the debugger until a new backtest is started. After 10 minutes of inactivity in the IDE, any breakpoints will be suspended and the backtest will automatically finish running. After 50 seconds, the breakpoint commands will time out. This is the same amount of time given for one handle_data call.
Setting a custom benchmark
The default benchmark in your algorithm is SPY, an ETF that tracks the S&P 500. You can change it in the initialize function by using set_benchmark and passing in another security. Only one security can be used for the benchmark, and only one benchmark can be set per algorithm. If you call set_benchmark more than once, the last call wins. Currently, the benchmark can only be set to an equity.
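For example, a minimal sketch (QQQ is an arbitrary illustrative replacement):

def initialize(context):
    # Benchmark against QQQ instead of the default SPY.
    set_benchmark(symbol('QQQ'))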
Experimental
The following experimental features have been recently added to the Optimize API. They are currently importable from quantopian.optimize.experimental. We expect to move them to quantopian.optimize once they've stabilized.

class RiskModelExposure(risk_model_loadings, version=None, min_basic_materials=None, max_basic_materials=None, min_consumer_cyclical=None, max_consumer_cyclical=None, min_financial_services=None, max_financial_services=None, min_real_estate=None, max_real_estate=None, min_consumer_defensive=None, max_consumer_defensive=None, min_health_care=None, max_health_care=None, min_utilities=None, max_utilities=None, min_communication_services=None, max_communication_services=None, min_energy=None, max_energy=None, min_industrials=None, max_industrials=None, min_technology=None, max_technology=None, min_momentum=None, max_momentum=None, min_size=None, max_size=None, min_value=None, max_value=None, min_short_term_reversal=None, max_short_term_reversal=None, min_volatility=None, max_volatility=None)

Constraint requiring bounded net exposure to the set of risk factors provided by the Quantopian Risk Model. Risk model loadings are specified as a DataFrame of floats whose columns are factor labels and whose index contains Assets. These are accessible via the Pipeline API using quantopian.pipeline.experimental.risk_loading_pipeline(). For each column in the risk_model_loadings frame, we constrain:

(new_weights * risk_model_loadings[column]).sum() >= min_exposure[column]
(new_weights * risk_model_loadings[column]).sum() <= max_exposure[column]

The constraint provides reasonable default bounds for each factor, which can be used with:

RiskModelExposure(risk_model_loadings, version=opt.Newest)

To override the default bounds, each factor has optional arguments to specify a custom minimum and a custom maximum bound:

RiskModelExposure(
    risk_model_loadings,
    min_technology=-0.4,
    max_technology=0.4,
    version=opt.Newest,
)

The default values provided by RiskModelExposure are versioned. In the event that the defaults change in the future, you can control how RiskModelExposure's behavior will change using the version parameter. Passing version=opt.Newest causes RiskModelExposure to use the most recent defaults:

RiskModelExposure(risk_model_loadings, version=opt.Newest)

Using version=opt.Newest means that your algorithm's behavior may change in the future if the RiskModelExposure defaults change in a future release. Passing an integer for version causes RiskModelExposure to use that particular version of the defaults:

RiskModelExposure(risk_model_loadings, version=0)

Using a fixed default version means that your algorithm's behavior will not change if the RiskModelExposure defaults change in a future release. The only fixed version number currently available is version 0. Version 0 applies bounds of (-0.18, 0.18) for each sector factor, and (-0.36, 0.36) for each style factor. If no value is passed for version, RiskModelExposure will log a warning and use opt.Newest.

Parameters:
risk_model_loadings (pd.DataFrame) - An (assets x labels) frame of weights for each (asset, factor) pair, as provided by quantopian.pipeline.experimental.risk_loading_pipeline().
version (int, optional) - Version of default bounds to use. Pass opt.Newest to use the newest version. Default is opt.Newest.
min_basic_materials (float, optional) - Minimum net exposure value for the basic_materials sector risk factor.
max_basic_materials (float, optional) - Maximum net exposure value for the basic_materials sector risk factor.
min_consumer_cyclical (float, optional) - Minimum net exposure value for the consumer_cyclical sector risk factor.
max_consumer_cyclical (float, optional) - Maximum net exposure value for the consumer_cyclical sector risk factor.
min_financial_services (float, optional) - Minimum net exposure value for the financial_services sector risk factor.
max_financial_services (float, optional) - Maximum net exposure value for the financial_services sector risk factor.
min_real_estate (float, optional) - Minimum net exposure value for the real_estate sector risk factor.
max_real_estate (float, optional) - Maximum net exposure value for the real_estate sector risk factor.
min_consumer_defensive (float, optional) - Minimum net exposure value for the consumer_defensive sector risk factor.
max_consumer_defensive (float, optional) - Maximum net exposure value for the consumer_defensive sector risk factor.
min_health_care (float, optional) - Minimum net exposure value for the health_care sector risk factor.
max_health_care (float, optional) - Maximum net exposure value for the health_care sector risk factor.
min_utilities (float, optional) - Minimum net exposure value for the utilities sector risk factor.
max_utilities (float, optional) - Maximum net exposure value for the utilities sector risk factor.
min_communication_services (float, optional) - Minimum net exposure value for the communication_services sector risk factor.
max_communication_services (float, optional) - Maximum net exposure value for the communication_services sector risk factor.
min_energy (float, optional) - Minimum net exposure value for the energy sector risk factor.
max_energy (float, optional) - Maximum net exposure value for the energy sector risk factor.
min_industrials (float, optional) - Minimum net exposure value for the industrials sector risk factor.
max_industrials (float, optional) - Maximum net exposure value for the industrials sector risk factor.
min_technology (float, optional) - Minimum net exposure value for the technology sector risk factor.
max_technology (float, optional) - Maximum net exposure value for the technology sector risk factor.
min_momentum (float, optional) - Minimum net exposure value for the momentum style risk factor.
max_momentum (float, optional) - Maximum net exposure value for the momentum style risk factor.
min_size (float, optional) - Minimum net exposure value for the size style risk factor.
max_size (float, optional) - Maximum net exposure value for the size style risk factor.
min_value (float, optional) - Minimum net exposure value for the value style risk factor.
max_value (float, optional) - Maximum net exposure value for the value style risk factor.
min_short_term_reversal (float, optional) - Minimum net exposure value for the short_term_reversal style risk factor.
max_short_term_reversal (float, optional) - Maximum net exposure value for the short_term_reversal style risk factor.
min_volatility (float, optional) - Minimum net exposure value for the volatility style risk factor.
max_volatility (float, optional) - Maximum net exposure value for the volatility style risk factor.
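As a minimal sketch of wiring the risk loadings into the constraint inside an algorithm (the pipeline name 'risk_loadings' and the rebalance function are illustrative assumptions; scheduling is omitted):

import quantopian.algorithm as algo
import quantopian.optimize as opt
from quantopian.optimize.experimental import RiskModelExposure
from quantopian.pipeline.experimental import risk_loading_pipeline

def initialize(context):
    # Attach the risk loadings pipeline under an arbitrary name.
    algo.attach_pipeline(risk_loading_pipeline(), 'risk_loadings')

def before_trading_start(context, data):
    # An (assets x factor labels) DataFrame, as RiskModelExposure expects.
    context.risk_loadings = algo.pipeline_output('risk_loadings')

def rebalance(context, data):
    risk_constraint = RiskModelExposure(
        risk_model_loadings=context.risk_loadings,
        version=opt.Newest,
    )
    # Pass risk_constraint in the constraints list given to
    # order_optimal_portfolio along with your objective.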
In Algorithm Use
The pipeline dataset is the method of accessing your data in algorithms:

from quantopian.pipeline.data.<UserId> import <dataset> as my_dataset

The example long-short equity algorithm lecture is a good example that leverages a publicly available dataset. You'll want to replace the following stocktwits import with your specific dataset and then update or replace the sentiment_score with a reference to your dataset column (adjusting the SimpleMovingAverage factor as needed).

from quantopian.pipeline.data.psychsignal import stocktwits

sentiment_score = SimpleMovingAverage(
    inputs=[stocktwits.bull_minus_bear],
    window_length=3,
)
Historical Dataset Ingestion
The typical delay from dataset submission to full availability is approximately 15 minutes. You can monitor the status by checking the load_metrics data in research.
Commission Models
To set the cost of your trades, use the set_commission method and pass in PerShare or PerTrade. Like the slippage model, set_commission must be used in the initialize method and has no effect if used elsewhere in your algorithm. If you don't specify a commission, your backtest defaults to $0.001 per share with a $0 minimum cost per trade.

def initialize(context):
    set_commission(commission.PerShare(cost=0.001, min_trade_cost=0))

Commissions are taken out of the algorithm's available cash. Regardless of which commission model you use, orders that are cancelled before any fills occur do not incur any commission.

The default commission model for US equities is PerShare, at $0.001 per share and $0 minimum cost per order. The first fill incurs at least the minimum commission, and subsequent fills incur additional commission. The PerTrade commission model applies a flat commission upon the first fill.

The default commission model for US futures is similar to retail brokerage pricing. Note that the model on Quantopian includes the $0.85 per-contract commission that a broker might charge as well as the per-trade fees charged by the exchanges. The exchange fees vary per asset and are listed on this page under the heading "Exchange and Regulatory Fees". To set a custom commission model for futures, pass a commission model as a us_futures keyword argument to set_commission(). The available commission models for futures are a PerContract model that applies a cost and an exchange fee per contract traded, and a PerFutureTrade model that applies a cost per trade.

# Setting custom equity and futures commission models.
def initialize(context):
    set_commission(
        us_equities=commission.PerShare(cost=0.001, min_trade_cost=0),
        us_futures=commission.PerContract(cost=1, exchange_fee=0.85, min_trade_cost=0),
    )

You can see how much commission has been associated with an order by fetching the order using get_order and then looking at its commission field.
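A minimal sketch of that last step, assuming a single tracked order (the bookkeeping on context is illustrative):

def initialize(context):
    context.order_id = None

def handle_data(context, data):
    if context.order_id is None:
        # Place one order and remember its id.
        context.order_id = order(symbol('AAPL'), 100)
    else:
        # Look up the order and record the commission charged so far.
        tracked = get_order(context.order_id)
        record(commission_paid=tracked.commission)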
Set Up Live Data From Dropbox
To use Dropbox, place your file in the Public folder and use the 'Public URL'. The Public URL has a format similar to https://dl.dropboxusercontent.com/s/1234567/filename.csv.
Uploading Your Data
Uploading your data is done in one or two steps. The first step is an upload of historical data, and the optional second step is to set up the continuous upload of live data. When uploading your historical data to Quantopian, you will be required to declare each column type. Once the column types have been declared, we will validate that all the historical data can be processed.
Monitoring DataSet Loads
We have surfaced an interactive load_metrics dataset that allows you to easily monitor the details and status of your dataset's historical loads and daily live updates.

import pandas as pd
from odo import odo
from quantopian.interactive.data.user_<UserId> import load_metrics

lm = odo(
    load_metrics[[
        'timestamp', 'dataset', 'status', 'rows_received', 'total_rows',
        'rows_added', 'delta_rows_added', 'last_updated', 'time_elapsed',
        'filenames_downloaded', 'source_last_updated', 'bytes_downloaded',
        'db_table_size', 'error',
    ]].sort('timestamp', ascending=False),
    pd.DataFrame,
)
lm
Live Trading and Fetcher
When Fetcher is used in Live Trading or Paper Trading, the fetch_csv() command is invoked once per trading day, when the algorithm warms up before market open. It's important that fetched data with dates in the past be maintained so that warm-up can be performed properly; Quantopian does not keep a copy of your fetched data, and algorithm warm-up will not work properly if past data is changed or removed. Data for 'today' and dates going forward can be added and updated. Any updates to the CSV file should happen before midnight Eastern Time for the data to be ready for the next trading day.
Working With Multiple Data Frequencies
When pulling in external data, you need to be careful about data frequency to prevent look-ahead bias. All Quantopian backtests are run at minute frequency. If you are fetching daily data, the daily row will be fetched at the beginning of the day instead of at the end of the day, which introduces look-ahead bias. To guard against this bias, you need to use the post_func function. For more information, see the post_func API documentation or take a look at this example algorithm. History() is not supported on fetched data. For a workaround, see this community forum post. For more information about Fetcher, go to the API documentation or look at the sample algorithms.
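A minimal sketch of the usual fix: shift each daily row so it becomes available only on the following day (the CSV URL and column name here are hypothetical):

import pandas as pd

def shift_forward(df):
    # Make each day's row available only on the following day, so the
    # algorithm never sees a value before the day it describes is over.
    df = df.copy()
    df.index = df.index + pd.Timedelta(days=1)
    return df

def initialize(context):
    fetch_csv('https://yourserver.com/daily_signal.csv',  # hypothetical URL
              date_column='date',
              post_func=shift_forward)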
Point in Time
When using fundamental data in Pipeline, Quantopian doesn't use the as_of date for exposing the data to your algorithm. Rather, a new value is exposed at the "point in time" in the past when the metric would have been known to you. This day is called the file date. Companies don't typically announce their earnings the day after the period is complete. If a company files earnings for the period ending June 30th (the as_of date), the file date (the date upon which this information is known to the public) is about 45 days later. Quantopian takes care of this logic for you in both research and backtesting. For data updates since Quantopian began subscribing to Morningstar's data, Quantopian tracks the file date based on when the information changes in Morningstar. For historical changes, Morningstar also provides a file date to reconstruct how the data looked at specific points in time. In circumstances where the file date is not known to Quantopian, the file date defaults to 45 days after the as_of date. For a more technical explanation of how we store data in a "Point in Time" fashion, see this post in the community.
Fixed Slippage
When using the FixedSlippage model, the size of your order does not affect the price of your trade execution. You specify a 'spread' that you think is a typical bid/ask spread to use. When you place a buy order, half of the spread is added to the price; when you place a sell order, half of the spread is subtracted from the price. Fills under fixed slippage models are not limited to the amount traded in the minute bar. In the first non-zero-volume bar, the order will be completely filled. This requires you to be careful about ordering; naive use of fixed slippage models will lead to unrealistic fills, particularly with large orders and/or illiquid securities.
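For instance, a minimal sketch assuming a typical one-cent spread:

def initialize(context):
    # Assume a typical bid/ask spread of $0.01: buys fill half a cent
    # above the price, sells half a cent below.
    set_slippage(slippage.FixedSlippage(spread=0.01))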
Custom Slippage
You can build a custom slippage model that uses your own logic to convert a stream of orders into a stream of transactions. In the initialize() function you must specify the slippage model to be used and any special parameters that the slippage model will use. Example:

def initialize(context):
    set_slippage(MyCustomSlippage())

Your custom model must be a class that inherits from slippage.SlippageModel and implements process_order(self, data, order). The process_order method must return a tuple of (execution_price, execution_volume), which signifies the price and volume for the transaction that your model wants to generate. The transaction is then created for you. Your model is passed the same data object that is passed to your other functions, letting you do a price or history lookup for any security in your model. The order object contains the rest of the information you need, such as the asset, order size, and order type. The order object has the following properties: amount (float), asset (Asset), stop and limit (float), and stop_reached and limit_reached (boolean). The trade_bar object is the same as data[sid] in handle_data and has open_price, close_price, high, low, volume, and sid. The slippage.create_transaction method takes the given order, the data object, and the price and amount calculated by your slippage model, and returns the newly constructed transaction.

Many slippage models' behavior depends on how much of the total traded volume is being captured by the algorithm. You can use self.volume_for_bar to see how many shares of the current security have been traded so far during this bar. If your algorithm has many different orders for the same stock in the same bar, this is useful for making sure you don't take an unrealistically large fraction of the traded volume.

If your slippage model doesn't place a transaction for the full amount of the order, the order stays open with an updated amount value, and will be passed to process_order on the next bar. Orders that have limits that have not been reached will not be passed to process_order. Finally, if your transaction has 0 shares or more shares than the original order amount, an exception will be thrown. Please see the sample custom slippage model; a rough sketch of such a model appears below.
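As a rough sketch of the shape such a model takes (the fixed 10-basis-point penalty is an arbitrary illustrative choice, not the sample model's logic):

class MyCustomSlippage(slippage.SlippageModel):
    def process_order(self, data, order):
        price = data.current(order.asset, 'price')
        # Penalize the fill price by 10 basis points in the direction
        # of the trade: buys fill higher, sells fill lower.
        impact = 0.001 if order.amount > 0 else -0.001
        execution_price = price * (1 + impact)
        # Fill the entire remaining order in this bar.
        return (execution_price, order.amount)

def initialize(context):
    set_slippage(MyCustomSlippage())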
Collaboration
You can collaborate in real time with other Quantopian members on your algorithm. In the IDE, press the "Collaborate" button and enter your friend's email address. Your collaborators will receive an email letting them know they have been invited. They will also see your algorithm listed in their Algorithms Library with a collaboration icon. If your collaborator isn't yet a member of Quantopian, they will have to register before they can see your algorithm. The collaboration experience is fully coordinated: the code is changed on all screens in real time; when one collaborator "Builds" a backtest, all of the collaborators see the backtest results, logging, and/or errors; and there is a chat tab for you to use while you collaborate.

Technical Details

Only the owner can invite other collaborators, deploy the algorithm for live trading, or delete the algorithm. There isn't a technical limit on the number of collaborators, but there is a practical limit: the more collaborators you have, the more likely you'll notice a performance problem. We'll improve that in the future. To connect with a member, search for their profile in the forums and send them a private message. If they choose to share their email address with you, then you can invite them to collaborate.
Set Up Live Data From Google Sheets
You can import a CSV from your Google Sheets spreadsheet. To get the public URL for your file: Click on File > Publish to the web. Change the 'web page' option to 'comma-separated values (.csv)'. Click the Publish button. Copy and paste the URL, which has a format similar to https://docs.google.com/spreadsheets/d/e/2PACX-1vQqUUOUysf9ETCvRCZS8EGnFRc4k7yd8IUshkNyfFn2NEeuCGNBFXfkXxyZr9HOYQj34VDp_GWlX_NX/pub?gid=1739886979&single=true&output=csv. Google Sheets has powerful IMPORTDATA and QUERY functions that can be used to download API data, rename columns, and filter by columns/dates automatically. For example:

QUERY(Sample!A1:L,"select A,K,L,D,E,B,G,H WHERE A >= date '2018-03-01'")
Running Backtests
You can set the start date, end date, starting capital, and trading calendar used by the backtest in the IDE. SPY, an ETF tracking the S&P 500, is the benchmark used for algorithms simulated in the backtest. Its returns include reinvested dividends, modeling total market performance. You can also set your own benchmark in your algorithm. To create a new backtest, click the 'Run Full Backtest' button in the IDE. That button appears once your algorithm successfully validates; if it is not visible in the upper right of the IDE, press the 'Build' button to start validation. We also provide a Backtests page with a summary of all backtests run against an algorithm. To go to the Backtests page, either click the Backtest button at the top right of the IDE, or, from the My Algorithms page, click the number of backtests that have been run. The Backtests page lists all the backtests that have been run for this algorithm, including any that are in progress. You can view an existing or in-progress backtest by clicking on it. Closing the browser will not stop the backtest from running; Quantopian runs in the cloud and will continue to execute your backtest until it finishes. If you want to stop the backtest, press the Cancel button.
API Documentation

Methods to implement
Your algorithm is required to implement one method: initialize. Two other methods, handle_data and before_trading_start, are optional.

initialize(context)

Called once at the very beginning of a backtest. Your algorithm can use this method to set up any bookkeeping that you'd like. The context object will be passed to all the other methods in your algorithm.

Parameters: context: An initialized and empty Python dictionary. The dictionary has been augmented so that properties can be accessed using dot notation as well as the traditional bracket notation.
Returns: None
Example:

def initialize(context):
    context.notional_limit = 100000

handle_data(context, data)

Called every minute.

Parameters: context: The same context object as in initialize; stores any state you've defined and the portfolio object. data: An object that provides methods to get price and volume data, check whether a security exists, and check the last time a security traded.
Returns: None
Example:

def handle_data(context, data):
    # all your algorithm logic here
    # ...
    order_target_percent(symbol('AAPL'), 1.0)
    # ...

before_trading_start(context, data)

Optional. Called daily, prior to market open. Orders cannot be placed inside this method. The primary purpose of this method is to use Pipeline to create the set of securities that your algorithm will use.

Parameters: context: The same context object as in handle_data. data: The same data object as in handle_data.
Returns: None
Important Concepts
Your custom data will be processed similarly to Quantopian Partner Data. In order to accurately represent your data in pipeline and avoid lookahead bias, your data will be collected, stored, and surfaced in a point-in-time nature. You can learn more about our process for working with point-in-time data here: How is it Collected, Processed, and Surfaced?. A more detailed description, Three Dimensional Time: Working with Alternative Data, is also available. Once on the platform, your datasets are only importable and visible by you. If you share an algorithm or notebook that uses one of your private datasets, other community members won't be able to import the dataset. Your dataset will be downloaded and stored on Quantopian-maintained servers, where it is encrypted at rest.
Earnings Calendars
Zipline implements an abstract definition for working with EarningsCalendar datasets. Since more than one of Quantopian's data partners can provide earnings announcement dates, Quantopian implements multiple distinct calendar datasets, as well as multiple distinct versions of the BusinessDaysUntilNextEarnings and BusinessDaysSincePreviousEarnings factors, which rely on the columns provided by each calendar. In general, datasets specific to a particular vendor are imported from quantopian.pipeline.data.<vendor>, while factors that depend on data from a specific vendor are imported from quantopian.pipeline.factors.<vendor>.

EventVestor

All datasets listed here are importable from quantopian.pipeline.data.eventvestor.

class EarningsCalendar Dataset providing dates of upcoming and recently announced earnings. Backed by data from EventVestor.
next_announcement = EarningsCalendar.next_announcement::datetime64[ns]
previous_announcement = EarningsCalendar.previous_announcement::datetime64[ns]

All factors listed here are importable from quantopian.pipeline.factors.eventvestor.

class BusinessDaysUntilNextEarnings(*args, **kwargs) Wrapper for the vendor's next earnings factor.
class BusinessDaysSincePreviousEarnings(*args, **kwargs) Wrapper for the vendor's previous earnings factor.

Risk Model (Experimental)

Functions and classes listed here provide access to the outputs of the Quantopian Risk Model via the Pipeline API. They are currently importable from quantopian.pipeline.experimental.
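A minimal pipeline sketch using the earnings factors (import paths as stated above; the output column names are arbitrary):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors.eventvestor import (
    BusinessDaysSincePreviousEarnings,
    BusinessDaysUntilNextEarnings,
)

def make_pipeline():
    # Business-day distance to the nearest past and future announcements.
    return Pipeline(columns={
        'days_since_earnings': BusinessDaysSincePreviousEarnings(),
        'days_until_earnings': BusinessDaysUntilNextEarnings(),
    })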
Zipline
Zipline is our open-source engine that powers the backtester in the IDE. You can see the code repository on GitHub and contribute pull requests to the project. There is a Google group available for seeking help and facilitating discussions. For other questions, please contact [email protected]. You can use Zipline to develop your strategy offline and then port it to Quantopian for paper trading and live trading using the get_environment method.
Objectives
class Objective Base class for objectives.

class TargetWeights(weights) Objective that minimizes the distance from an already-computed portfolio. Parameters: Notes A target value of 1.0 indicates that 100% of the portfolio's current net liquidation value should be held in a long position in the corresponding asset. A target value of -1.0 indicates that -100% of the portfolio's current net liquidation value should be held in a short position in the corresponding asset. Assets with target values of exactly 0.0 are ignored unless the algorithm has an existing position in the given asset. If an algorithm has an existing position in an asset and no target weight is provided, the target weight is assumed to be zero.

class MaximizeAlpha(alphas) Objective that maximizes weights.dot(alphas) for an alpha vector. Ideally, alphas should contain coefficients such that alphas[asset] is proportional to the expected return of asset for the time horizon over which the target portfolio will be held. In the special case that alphas is an estimate of expected returns for each asset, this objective simply maximizes the expected return of the total portfolio. Parameters: Notes This objective should almost always be used with a MaxGrossExposure constraint, and should usually be used with a PositionConcentration constraint. Without a constraint on gross exposure, this objective will raise an error attempting to allocate an unbounded amount of capital to every asset with a nonzero alpha. Without a constraint on individual position size, this objective will allocate all of its capital to the single asset with the largest expected return.
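A minimal sketch of TargetWeights, targeting an equal-weight long portfolio over a fixed asset list (the asset list stored on context is an illustrative assumption):

import pandas as pd
import quantopian.algorithm as algo
import quantopian.optimize as opt

def rebalance(context, data):
    # context.assets: a list of Asset objects chosen elsewhere (hypothetical).
    weights = pd.Series(1.0 / len(context.assets), index=context.assets)
    objective = opt.TargetWeights(weights)
    algo.order_optimal_portfolio(objective, constraints=[])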
Results and Errors
class OptimizationResult The result of an optimization.

raise_for_status() Raise an error if the optimization did not succeed.
print_diagnostics() Print diagnostic information gathered during the optimization.
old_weights (pandas.Series) Portfolio weights before the optimization.
new_weights (pandas.Series or None) New optimal weights, or None if the optimization failed.
diagnostics (quantopian.optimize.Diagnostics) Object containing diagnostic information about violated (or potentially violated) constraints.
status (str) String indicating the status of the optimization.
success (bool) True if the optimization successfully produced a result.

class InfeasibleConstraints Raised when an optimization fails because there are no valid portfolios. This most commonly happens when the weight in some asset is simultaneously constrained to be above and below some threshold.
class UnboundedObjective Raised when an optimization fails because at least one weight in the 'optimal' portfolio is 'infinity'. More formally, raised when an optimization fails because the value of an objective function improves as a value being optimized grows toward infinity, and no constraint puts a bound on the magnitude of that value.
class OptimizationFailed Generic exception raised when an optimization fails for a reason with no special metadata.
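A short sketch of inspecting a result from run_optimization, consistent with the attributes above (objective and constraints assumed to be defined earlier):

result = run_optimization(objective, constraints)
if not result.success:
    # Show which constraints were (or may have been) violated.
    result.print_diagnostics()
result.raise_for_status()  # Raises if the optimization failed.
weights = result.new_weights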