
The Promise and Peril of Very Big Models

Scaling AI, O’Reilly
September 2021


In the machine learning community, we're trained to think of size as inversely proportional to bias, driving us to ever larger datasets, increasingly complex model architectures, and ever better accuracy scores. But bigger doesn't always mean better.

What data quality issues emerge in large datasets? What complications surface as features become more geodistributed (e.g., diurnal patterns, seasonal variations, datetime formatting, multilingual text, etc.)? What happens as models attempt to extrapolate bigger and bigger patterns? Why is it that the pursuit of megamodels has driven a wedge between the ML definition of “bias” and the more colloquial sense of the word?

Perhaps the time has come to move away from monolithic models that reduce rich variations and complexities to a simple argmax on the output layer and instead embrace a new generation of model architectures that are just as organic and diverse as the data they seek to encode.

Transcript

1. The Promise & Peril of Very Big Models. Scaling AI, O'Reilly, September 2021
2. QUICK POLL 1. August is in winter: a. True b. False (The Promise & the Peril / rotational.io)
3. QUICK POLL 2. If a name ends in a vowel, it is most likely: a. Male b. Female c. I don't know
4. QUICK POLL 3. What is this drink called? a. Soda b. Cola c. Pop d. Something else
5. Dr. Rebecca Bilbro: Founder & CTO, Rotational Labs, LLC; Adjunct Faculty, Georgetown University; Applied Text Analysis with Python, O'Reilly; Co-Creator and Maintainer, Scikit-Yellowbrick
6. What if data systems were a little smarter? (rotational.io)
7. TALK OVERVIEW. MOTIVATIONS: Addressing the parrot in the room. STRATEGIES: Encoding locale, cohort, and context as features. TAKEAWAYS: Imagining new model and app architectures.
8. "The Internet is a large and diverse virtual space, and accordingly, it is easy to imagine that very large datasets... must therefore be broadly representative of the ways in which different people view the world. However…" (Bender et al., 2021, "On the Dangers of Stochastic Parrots")
9. MEGA MODEL PROBLEMS: they magnify the biases learned from the training data; model training has a large carbon footprint; they filter out the voices of marginalized people; "ersatz fluency" gives the impression of coherence without responsibility.
10. A recently retired Computer Science professor* was preparing to box up her campus office... (*my mother)
11. MODELING SENTIMENT. Do people express joy, anger, sadness, and sarcasm the same way everywhere? "I'm chuffed to bits" / "I'm happier than a pig in s***" / "You little ripper"
12. The monolithic approach, as a single training script:

        # assumed imports (not shown on the slide): tts is scikit-learn's
        # train_test_split; compare and RealGoodClassifier are placeholders
        import pandas as pd
        from sklearn.model_selection import train_test_split as tts

        def train_model(X, y, estimator):
            """Split the data and train the model using the estimator"""
            X_train, X_test, y_train, y_test = tts(X, y)
            model = estimator.fit(X_train, y_train)
            y_pred = model.predict(X_test)
            score = compare(y_test, y_pred)
            return model, score

        if __name__ == "__main__":
            training_df = pd.read_csv("global_dataset.csv")
            features = training_df[["comment", "timezone", "city"]].values
            target = training_df["sentiment"].values
            train_model(features, target, RealGoodClassifier())
13. Allow for cohort-based modeling:

        def make_cohorts(df, c_attrs):
            """
            Use the c_attrs (list of cohort attributes) to
            group the dataframe and return cohorts
            """
            cohorts = df.groupby(c_attrs)
            return cohorts

    "Will it hurt my accuracy score?" is probably the wrong question. A better question is: "Which groups of users are least well served by the current model, and how can I improve their experience?"
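The groupby idea above extends naturally to fitting one model per cohort rather than one global model. A minimal sketch of that loop, using plain Python dicts in place of the slide's pandas groupby; `MajorityLabel` is an invented toy stand-in for a real estimator such as the slide's RealGoodClassifier:

```python
from collections import Counter, defaultdict

class MajorityLabel:
    """Toy stand-in for a real estimator: predicts the most
    common sentiment label seen during training."""
    def fit(self, rows, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        return [self.label_] * len(rows)

def make_cohorts(rows, c_attr):
    """Group rows (a list of dicts) by a cohort attribute."""
    cohorts = defaultdict(list)
    for row in rows:
        cohorts[row[c_attr]].append(row)
    return cohorts

def train_per_cohort(rows, c_attr, target):
    """Fit one model per cohort; returns {cohort_key: model}."""
    models = {}
    for key, cohort in make_cohorts(rows, c_attr).items():
        labels = [r[target] for r in cohort]
        models[key] = MajorityLabel().fit(cohort, labels)
    return models

comments = [
    {"comment": "chuffed to bits", "city": "london", "sentiment": "positive"},
    {"comment": "i'm so pleased", "city": "london", "sentiment": "positive"},
    {"comment": "you little ripper", "city": "london", "sentiment": "neutral"},
    {"comment": "you little ripper", "city": "raleigh", "sentiment": "negative"},
    {"comment": "this is terrible", "city": "raleigh", "sentiment": "negative"},
]
models = train_per_cohort(comments, "city", "sentiment")
```

Because each cohort gets its own estimator, locale-specific patterns are not averaged away by a single global argmax, and evaluating each cohort's model on its own held-out slice surfaces exactly the "least well served" groups the slide asks about.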
14. What is a "cohort"? A group of users with something in common, e.g. physical location (city, region, etc.), language or dialect (ps-AR, en-US, zh-HK, etc.), or convention (metric system, Hebrew calendar, etc.), especially if their UX or preferences might be unique, marginalized, or underrepresented (e.g. class imbalance).
15. Allow communities to label the same data differently. Sentiment by cohort:

        comment                        raleigh    london     melbourne
        "chuffed to bits"              neutral    positive   positive
        "happier than a pig in s***"   positive   negative   neutral
        "you little ripper"            negative   neutral    positive
        "i'm so pleased"               positive   positive   positive
        "this is terrible"             negative   negative   negative
        "door handle"                  neutral    neutral    neutral
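One way to represent community-specific labels like these in code is a per-cohort label map. The label values below are copied from the slide's table; the dict structure and the `gold_label` helper are illustrative, not from the deck:

```python
# Per-cohort gold labels for the same comments. Values come from the
# slide's table; the lookup helper is an illustrative assumption.
COHORT_LABELS = {
    "chuffed to bits": {"raleigh": "neutral", "london": "positive", "melbourne": "positive"},
    "happier than a pig in s***": {"raleigh": "positive", "london": "negative", "melbourne": "neutral"},
    "you little ripper": {"raleigh": "negative", "london": "neutral", "melbourne": "positive"},
    "i'm so pleased": {"raleigh": "positive", "london": "positive", "melbourne": "positive"},
    "this is terrible": {"raleigh": "negative", "london": "negative", "melbourne": "negative"},
    "door handle": {"raleigh": "neutral", "london": "neutral", "melbourne": "neutral"},
}

def gold_label(comment, cohort):
    """Return the community-assigned sentiment for a comment."""
    return COHORT_LABELS[comment][cohort]
```

The point of the structure is that a single comment has no one "true" label; the ground truth is indexed by cohort, so training and evaluation can both respect local usage.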
16. Architect for localized models rather than homogenized ones: a Global Model Management System. And be careful to follow the rules!
17. Create feedback loops attuned to cohort-based evaluation.
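A sketch of what such a cohort-attuned feedback loop might compute, assuming per-cohort gold labels and predictions are available (the function and record names are assumptions, not from the deck): score each cohort separately so that a strong global average cannot hide a poorly served group.

```python
from collections import defaultdict

def per_cohort_accuracy(records):
    """records: iterable of (cohort, gold_label, predicted_label).
    Returns {cohort: accuracy}, so under-served cohorts stand out
    even when the global average looks fine."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for cohort, gold, pred in records:
        totals[cohort] += 1
        hits[cohort] += int(gold == pred)
    return {cohort: hits[cohort] / totals[cohort] for cohort in totals}

records = [
    ("london", "positive", "positive"),
    ("london", "neutral", "neutral"),
    ("melbourne", "positive", "negative"),
    ("melbourne", "positive", "negative"),
]
scores = per_cohort_accuracy(records)
```

Here the global accuracy is 50%, which looks mediocre but survivable; the per-cohort breakdown reveals that one community is served perfectly and another not at all, which is the signal the feedback loop needs.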
18. Allow new cohorts to emerge ("What a good timepass", "I'm chuffed to bits").
19. REMEMBER THE REAL GOAL… HINT: IT'S NOT AN F1 SCORE OR LOSS MEASURE.
20. THANK YOU. Have something big to share? Even better, something small to share? rebecca@rotational.io
