Weekly Research Group 2021-04-29 - BigBird Code Example

You can watch the recording for this week’s session here: https://youtu.be/VSPgnGDc69E

So far, I’ve struggled to get BigBird to outperform the original BERT (where BERT simply truncates the text to fit its 512-token limit). This week, the group helped me figure out how we might craft a code example that best demonstrates where BigBird is most applicable and useful.
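For context, here’s a minimal sketch of the truncation baseline versus BigBird’s longer input window, using the Hugging Face tokenizers (the 512 and 4,096 limits are the standard maximum lengths for these checkpoints, and the repeated filler text is just a placeholder for a real long document):

```python
from transformers import AutoTokenizer

# A stand-in long document (repeated filler text purely for illustration).
long_text = " ".join(["example"] * 5000)

# Baseline: BERT only sees the first 512 tokens -- the rest is thrown away.
bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert_inputs = bert_tok(long_text, truncation=True, max_length=512)

# BigBird: sparse attention makes it feasible to feed up to 4,096 tokens
# of the same document.
bigbird_tok = AutoTokenizer.from_pretrained("google/bigbird-roberta-base")
bigbird_inputs = bigbird_tok(long_text, truncation=True, max_length=4096)

print(len(bert_inputs["input_ids"]))     # 512
print(len(bigbird_inputs["input_ids"]))  # 4096
```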

In the process, we touched on:

  • The authors’ recommendation to use sparse attention only for inputs longer than 1,024 tokens (see the sketch after this list).
  • Why BigBird is valuable for Question Answering.
  • Possible strategies for addressing GPU memory concerns with BigBird.
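
To act on the first point, one simple option (a sketch based on my current understanding of the Hugging Face port, where the attention_type setting can be "block_sparse" or "original_full") is to pick the attention implementation based on the input length:

```python
from transformers import BigBirdModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/bigbird-roberta-base")

def load_bigbird_for(text):
    """Pick BigBird's attention implementation based on input length."""
    num_tokens = len(tokenizer(text)["input_ids"])
    # Per the authors' recommendation, block-sparse attention only pays off
    # above ~1,024 tokens; for shorter inputs, use ordinary full attention.
    attn_type = "block_sparse" if num_tokens > 1024 else "original_full"
    return BigBirdModel.from_pretrained(
        "google/bigbird-roberta-base", attention_type=attn_type
    )
```

On the memory side, the usual levers would be gradient checkpointing (model.gradient_checkpointing_enable() in recent versions of transformers), mixed-precision training, and smaller batches with gradient accumulation, though we haven’t settled on which of these the example will use.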

You can view my “Research Journal” doc here: BigBird Research Journal - Google Docs

Outside of BigBird, we also talked about how to use a classifier to help label a large unlabeled dataset, and strategies for detecting the author of a piece of text.
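
On the labeling question, one common pattern is pseudo-labeling: train a classifier on whatever labels you already have, then use its most confident predictions to pre-label (or prioritize for review) the rest. Here’s a generic sketch with made-up data and an arbitrary confidence cutoff, not necessarily the exact approach we’ll take:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: a small hand-labeled seed set and a large unlabeled pool
# (random features here are placeholders for real document features).
X_seed, y_seed = np.random.rand(200, 64), np.random.randint(0, 2, 200)
X_pool = np.random.rand(10_000, 64)

# 1. Train an initial classifier on the seed set.
clf = LogisticRegression(max_iter=1000).fit(X_seed, y_seed)

# 2. Score the unlabeled pool and keep only high-confidence predictions
#    as candidate labels for human review (the 0.95 cutoff is arbitrary).
probs = clf.predict_proba(X_pool)
confident = probs.max(axis=1) > 0.95
candidate_labels = probs[confident].argmax(axis=1)

print(f"{confident.sum()} of {len(X_pool)} examples pre-labeled for review")
```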

You can view our notes from the Q&A discussion in the following doc: Discussion Group Q&A 2021 Q2 - Google Docs

I’ll be implementing the group’s suggestions this week, and at the next session we’ll see how it went!
